Data Mining
How does data mining work?
Data mining is the process of analyzing data from different perspectives to discover relationships among separate data items. Data mining software is one of several tools for analyzing data, and it can be used to cut costs, increase revenue, or both. [http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm]
The object of data mining is to find data characteristics, relationships, and dependencies that were not known beforehand. Traditional data analysis tools are reactive: the end user must first discover a problem and is then responsible for taking action to solve it. If the end user does not discover a problem, or decides not to act, nothing happens. Data mining works the opposite way. It proactively searches the data for issues such as anomalies and possible relationships, and the knowledge it uncovers can give the user an advantage.
Data mining is used primarily to analyze patterns and relationships in data in response to end-user queries. Usually four different types of relationships are sought:
1. Classes: Data is sorted into predefined groups.
2. Clusters: Data items are grouped according to logical relationships or user preferences.
3. Associations: Data is mined to find associations between two types of data.
4. Sequential patterns: Data is mined to determine patterns and trends over time.
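As an illustration of the third relationship type, the following is a minimal sketch of association mining, assuming the data is a list of shopping baskets. The `transactions` data, the `pair_support` helper, and the 0.4 threshold are all hypothetical and chosen only for the example; real tools use more elaborate algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one customer's basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "chips"},
    {"bread", "butter"},
    {"beer", "chips", "milk"},
]

def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so ("bread", "milk") and ("milk", "bread") count as one pair.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

support = pair_support(transactions)

# Keep only pairs that appear in at least 40% of baskets (assumed threshold).
frequent = {pair: s for pair, s in support.items() if s >= 0.4}
```

Pairs that survive the threshold, such as bread and milk in this made-up data, are candidate associations that an analyst would then examine more closely.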
Data mining consists of five major elements:
1. Extract, transform, and load transaction data onto the data warehouse system.
2. Store and manage the data in a multidimensional database system.
3. Provide data access to business analysts and information technology professionals.
4. Analyze the data with application software.
5. Present the data in a useful format, such as a graph or table.
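The five elements above can be sketched end to end on a very small scale. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the data warehouse (it is relational, not multidimensional), and the `raw_rows` data and the `transform`, `load`, `analyze`, and `present` helpers are hypothetical names invented for the example.

```python
import sqlite3

# Hypothetical raw transaction records: (date, region, amount as text).
raw_rows = [
    ("2024-01-05", "north", "19.99"),
    ("2024-01-05", "south", "5.00"),
    ("2024-01-06", "north", "12.50"),
    ("2024-01-06", "south", ""),  # missing amount, dropped in transform
]

def transform(rows):
    """Convert amounts to numbers and skip rows with no amount."""
    return [(day, region.upper(), float(amount))
            for day, region, amount in rows if amount]

def load(conn, rows):
    """Create the sales table and insert the cleaned rows."""
    conn.execute("CREATE TABLE sales (day TEXT, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

def analyze(conn):
    """Total sales per region, the kind of query an analyst might run."""
    cur = conn.execute(
        "SELECT region, ROUND(SUM(amount), 2) FROM sales "
        "GROUP BY region ORDER BY region"
    )
    return cur.fetchall()

def present(totals):
    """Format the result as a small text table."""
    lines = [f"{'region':<8}{'total':>8}"]
    lines += [f"{region:<8}{total:>8.2f}" for region, total in totals]
    return "\n".join(lines)

conn = sqlite3.connect(":memory:")  # stands in for the data warehouse
load(conn, transform(raw_rows))
totals = analyze(conn)
report = present(totals)
print(report)
```

Keeping each element in its own function mirrors the separation described in the list: the analysis step only sees data that has already been cleaned and stored.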
Stages of Data Mining
There are three separate stages of data mining: (1) exploration, (2) model building and validation, and (3) deployment.
Exploration
This stage starts with preparing the data, which includes data cleaning, transformation, and selecting records. Depending on the nature of the problem, this first stage may involve anything from a simple choice of predictors for a regression model to a broader analysis that identifies the most relevant variables and determines the complexity and general nature of the models that can be taken into account in the next stage.
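A minimal sketch of the exploration stage follows, assuming the data is a list of records with some missing values. The `records` data and the `clean`, `pearson`, and `rank_variables` helpers are hypothetical. Ranking variables by their correlation with the target is only one simple way to judge relevance, not the only one.

```python
from math import sqrt

# Hypothetical records; None marks a missing value.
records = [
    {"age": 25, "visits": 2, "spend": 40.0},
    {"age": 32, "visits": 5, "spend": 95.0},
    {"age": 47, "visits": None, "spend": 60.0},  # incomplete, removed below
    {"age": 51, "visits": 8, "spend": 160.0},
    {"age": 38, "visits": 3, "spend": 70.0},
]

def clean(rows):
    """Data cleaning: keep only records with no missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rank_variables(rows, target):
    """Order candidate variables by absolute correlation with the target."""
    ys = [r[target] for r in rows]
    scores = {
        name: pearson([r[name] for r in rows], ys)
        for name in rows[0] if name != target
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

cleaned = clean(records)
ranking = rank_variables(cleaned, target="spend")
```

In this made-up data, `ranking` lists `visits` ahead of `age`, which suggests trying `visits` first as a predictor in the model-building stage.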
Model building and validation
This stage involves choosing the best model based on predictive performance. This sounds like an easy task, but it can be difficult. Several different methods may be used to compare candidate models and determine which one performs best on your data.
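One common way to compare models on predictive performance is cross-validation: each model is fitted on part of the data and scored on the part it has not seen. The sketch below assumes two deliberately simple candidates, a mean baseline and a least-squares line. The `data` values and the `fit_mean`, `fit_line`, and `cross_val_mse` helpers are hypothetical.

```python
# Hypothetical (x, y) observations with a roughly linear trend.
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8),
        (5, 10.1), (6, 12.2), (7, 13.8), (8, 16.1)]

def fit_mean(train):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_line(train):
    """Least-squares straight line y = a + b * x."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    b = (sum((x - mx) * (y - my) for x, y in train)
         / sum((x - mx) ** 2 for x, _ in train))
    a = my - b * mx
    return lambda x: a + b * x

def cross_val_mse(fit, rows, k=4):
    """Average squared prediction error over k held-out folds."""
    errors = []
    for i in range(k):
        test = rows[i::k]  # every k-th row, starting at offset i
        train = [r for j, r in enumerate(rows) if j % k != i]
        model = fit(train)
        errors += [(model(x) - y) ** 2 for x, y in test]
    return sum(errors) / len(errors)

candidates = {"mean": fit_mean, "line": fit_line}
scores = {name: cross_val_mse(fit, data) for name, fit in candidates.items()}
best = min(scores, key=scores.get)  # model with the lowest held-out error
```

Scoring on held-out rows matters because a model that merely memorizes its training data would otherwise look better than it is.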
Deployment
This stage builds on the previous two: the model you chose is applied to new data to generate predictions or estimates of the expected outcome.
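Deployment can be as simple as storing the chosen model's parameters and applying them to records that arrive later. The sketch below assumes a straight-line model whose coefficients were saved as JSON. The coefficient values, the `load_model` and `score` helpers, and the record fields are all hypothetical.

```python
import json

# Hypothetical coefficients produced by an earlier model-building step.
saved_model = json.dumps({"intercept": 0.1, "slope": 2.0})

def load_model(text):
    """Rebuild a prediction function from stored coefficients."""
    params = json.loads(text)
    return lambda x: params["intercept"] + params["slope"] * x

def score(model, new_rows):
    """Attach a prediction to each new, previously unseen record."""
    return [{**row, "predicted": model(row["x"])} for row in new_rows]

model = load_model(saved_model)
predictions = score(model, [{"id": "a", "x": 9}, {"id": "b", "x": 10}])
```

Each returned record keeps its original fields and gains a `predicted` value, so the output can be passed on to whatever report or system consumes it.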
Data Mining Company business practices
There are many data mining companies spread across the United States and the world. Many specialize in mining specific types of data for specific industries or specific areas of a business, such as sales, employee efficiency, or supply chain efficiency. These companies follow certain business practices and procedures to ensure that clients’ data is not lost, stolen, or used against them. One common practice is that the data mining company does not retain any raw data, only copies of the reports for future use by the client. Another is to have each employee work on projects for a single client only, so that proprietary information or raw data is not transferred to, or used in the reports of, another client.
When data mining companies start a project, they want to gather as much raw data as they can, no matter how innocuous it may seem. One reason is that the processing power of modern computers has made analyzing data much faster. In addition, data mining companies have discovered that people in the intended target group will regularly do two seemingly unrelated tasks together, or do specific tasks on specific days, so more data points lead to better results.
In conclusion, data mining companies work in a confidential environment, and many businesses rely on them to make decisions ranging from day-to-day operations to five-year plans. The longer a company works with a specific client or in a specific industry, the better it becomes at predicting when, where, and how things are going to happen.