Data Input
“Automation is used where possible to collect data. Manual data collection processes are quick and easy for the data collector to follow. They understand the requirements of the data fields and have the competence to use the data capture systems.”

Data is an asset. Companies spend a large amount of money defining business processes, implementing computer systems and employing manpower to create this asset. We can think of the output of the data collection system as an asset with a value, and that value can easily be millions of pounds or dollars. I’ll cover how to calculate this in a later section.
At the front end of the data creation system are the data inputs. If the inputs are missing or incorrect, then the data creation system cannot create quality outputs, no matter how well the rest of the system functions. Time, resources and energy will still be consumed throughout the data creation system; however, it will be producing an asset that can’t be used to support good business decisions.
Think of it like a car manufacturing plant. If the quality of the steel used to construct the car is not on spec, and some components are missed out during manufacturing, the process will still continue. A car will be produced, but the company may not be able to sell it at the end of the process. They will have an asset with practically no value. They will also have an additional non-value-adding cost, because they must still store and eventually scrap the unusable car, both of which cost money.
If the defective data is nevertheless treated as a usable asset and used to support business decisions, the resulting poor decisions can be even more harmful to the business.
Classes of Input Defects
1. Data not collectable. – There is no source or means of collecting the data required.
2. No data storage location. – There is no defined location, i.e., database table and/or data capture field, available to collect and store the data record.
3. Data not collected. – There is a field for the data, but it is not collected consistently.
4. Data collected, but not reportable. – The data is collected but entered into an incorrect field of the data record or into an unstructured free-text area.
5. Inaccurate data. – The data has a field in the system. It is recorded, but it has an incorrect value.
Automated Data Entry
Automating data entry is generally the preferred option. Automation is consistent and scalable. Automatic data entry can still be subject to the 5 types of input defect listed above; however, there is one key difference. Automated systems are generally either always right or always wrong, and they are consistent and predictable.
If a data input defect is detected, then a programmer can usually fix the issue and may even be able to correct the previous erroneous records. These data collection defects are typically a result of human design deficiencies, but once fixed, the system will generally record the data flawlessly. For example, a data historian is collecting vibration data from a vibration sensor, but it has been recognised that the data is being recorded against the incorrect equipment record. Once the mapping is corrected, new readings are stored against the right equipment, and the earlier readings can usually be reassigned as well.
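As a rough illustration of updating previous erroneous records, the sketch below reassigns one sensor's readings from the wrong equipment record to the correct one. It uses an in-memory SQLite database purely for demonstration; the `readings` table, its columns, and the sensor and equipment identifiers are all hypothetical and will differ in a real data historian.

```python
import sqlite3


def reassign_readings(conn, sensor_id, wrong_equipment_id, correct_equipment_id):
    """Move one sensor's readings from the wrong equipment record to the right one.

    Returns the number of rows that were updated.
    """
    cursor = conn.execute(
        "UPDATE readings SET equipment_id = ? "
        "WHERE sensor_id = ? AND equipment_id = ?",
        (correct_equipment_id, sensor_id, wrong_equipment_id),
    )
    conn.commit()
    return cursor.rowcount


# Hypothetical demonstration data: sensor VIB-01 was mapped to PUMP-B by mistake.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (sensor_id TEXT, equipment_id TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("VIB-01", "PUMP-B", 2.1),
        ("VIB-01", "PUMP-B", 2.3),
        ("VIB-02", "PUMP-B", 1.9),  # correctly mapped, must not be touched
    ],
)

fixed = reassign_readings(conn, "VIB-01", "PUMP-B", "PUMP-A")
print(f"{fixed} readings reassigned")
```

Filtering on both the sensor and the wrong equipment record keeps the correction narrow, so readings from other sensors on the same equipment are left alone.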
Automated data collection can be subject to all 5 defect types identified above. Below are some aspects of automated data entry that could cause data input issues, and the mitigation actions that can be put in place to reduce the impact of these errors.
Defect type 1 – Data not collectable and Defect type 2 – No data storage location
This is not usually an issue for automated data collection, as a prerequisite for automating a system is having a data source and a location to store the data. The source will typically be another database system or a signal produced by a sensor or instrument, for example vibration, current, temperature, rpm, humidity or run hours.
Defect type 3 – Data not collected
Problems with data not being collected in automated systems are typically due to communication or instrumentation failures. Either the data input instrument has a defect and is not collecting the data, or it is collecting the data but a communications issue is preventing it from updating the database.
Mitigation Measures: To guard against this, it is normally possible to run a periodic routine to check that the database is being updated as expected. The IT team can define a program to monitor the source database at a pre-defined frequency and check that the database fields have been populated. If no data is found, a report or email can inform the relevant person, who can then investigate the issue.
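A minimal sketch of such a monitoring routine is shown below. It assumes the time of the latest record for each data tag has already been read from the database into a dictionary; the tag names, timestamps and the 30-minute threshold are illustrative assumptions, and the `print` call stands in for whatever report or email mechanism is actually used.

```python
from datetime import datetime, timedelta


def find_stale_tags(last_update_by_tag, now, max_age):
    """Return the tags whose latest record is missing or older than max_age."""
    stale = []
    for tag, last_update in last_update_by_tag.items():
        if last_update is None or now - last_update > max_age:
            stale.append(tag)
    return sorted(stale)


# Hypothetical snapshot of the most recent update time per tag.
now = datetime(2024, 1, 1, 12, 0)
last_seen = {
    "PUMP-A/vibration": datetime(2024, 1, 1, 11, 55),   # updated 5 minutes ago
    "PUMP-A/temperature": datetime(2024, 1, 1, 9, 0),   # updated 3 hours ago
    "PUMP-B/run_hours": None,                           # never populated
}

stale = find_stale_tags(last_seen, now, max_age=timedelta(minutes=30))
for tag in stale:
    # Placeholder for the report or email to the relevant person.
    print(f"No recent data for {tag}; please investigate")
```

The maximum age should be chosen per tag to suit how often that source is expected to report, otherwise slow-changing values such as run hours will raise false alarms.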
Defect type 4 – Data collected, but not reportable
Again, this type of defect should not occur in automated systems. It is, however, possible that during the design stage of the automated data collection system the data is mapped to the incorrect field or stored as an incorrect data type, e.g., a number stored as a text value. This can normally be identified and fixed during the testing phase of setting up the automatic data collection system.
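One way to catch a mapping or data type mistake during testing is to compare a sample record against the types each field is expected to hold. The sketch below is only illustrative: the field names and expected types are assumptions, not part of any particular system.

```python
def find_type_mismatches(record, expected_types):
    """Return {field: actual type name} for fields that are missing or mistyped."""
    problems = {}
    for field, expected in expected_types.items():
        value = record.get(field)
        if not isinstance(value, expected):
            problems[field] = type(value).__name__
    return problems


# Hypothetical field definitions for a vibration record.
expected_types = {
    "equipment_id": str,
    "vibration_mm_s": float,
    "run_hours": float,
}

# A test record where the vibration value has arrived as text.
record = {"equipment_id": "PUMP-A", "vibration_mm_s": "2.4", "run_hours": 1203.5}

mismatches = find_type_mismatches(record, expected_types)
print(mismatches)
```

A field that is absent from the record is reported with the type name `NoneType`, so the same check also highlights data that was mapped to the wrong place and left the intended field empty.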
Defect type 5 – Inaccurate data
In an automated system, this defect is typically a result of a defect in the instrumentation collecting the data, e.g., it has drifted outside its calibration range.
Mitigation Measures: It is more of a challenge to identify inaccurate data than missing data. This is because the data field will be populated, but how do you know if the value is accurate?
If the differences between the recorded and actual values are small, it is difficult to tell whether the data is correct or erroneous. You need to understand whether small variances make a material difference to how the data will be used. This will help you decide what measures to put in place to monitor the data quality.
Condition monitoring is a good starting point, i.e., let the instrument tell you when it has a defect. Some newer instruments have self-diagnostic routines that can create an error signal when they detect the signal may be incorrect. This error signal can be captured and monitored automatically, then used to trigger an alert to the relevant person or even displayed next to the data on the final dashboard.
Ensuring there is a maintenance routine in place for recalibration at an adequate frequency is the next line of defence.
Larger variations will be detected as part of the normal high/low trigger points that should be in place to monitor the operating envelope of the machinery. If these are triggered due to an error in the data value, then they will be investigated and, if so, confirmed as a data error.
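The high/low trigger check can be sketched as follows. The trigger points and readings are made-up values for a vibration signal in mm/s; real limits would come from the operating envelope of the machinery.

```python
def classify_reading(value, low, high):
    """Classify a reading against the low and high trigger points."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "ok"


# Illustrative trigger points and readings (assumed values).
LOW_TRIGGER = 0.5
HIGH_TRIGGER = 7.0
readings = [2.1, 2.3, 45.0, 0.0, 2.2]

flagged = [
    (index, value, classify_reading(value, LOW_TRIGGER, HIGH_TRIGGER))
    for index, value in enumerate(readings)
    if classify_reading(value, LOW_TRIGGER, HIGH_TRIGGER) != "ok"
]
for index, value, status in flagged:
    print(f"Reading {index} = {value} is {status}; investigate machine or data error")
```

The check cannot tell a genuine machine problem from a faulty instrument. It only raises the flag, and the investigation decides which it is.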
Finally, it may require a human to sense check the data at an adequate frequency and look for values that seem out of place or inconsistent.
However, automated data collection should still go through a validation process. For example, a semi-automated validation process would be to email a validator a sample every week, or to produce an exception report if data fields are missing due to an instrument or communications failure.
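Both ideas can be sketched in a few lines. The record layout and field names below are hypothetical, and sending the email or report is left out; the functions only select what a validator would receive.

```python
import random


def weekly_sample(records, sample_size, seed=None):
    """Pick a random sample of records for a human validator to review."""
    rng = random.Random(seed)
    return rng.sample(records, min(sample_size, len(records)))


def exception_report(records, required_fields):
    """List (record index, field name) pairs where a required field is missing."""
    return [
        (index, field)
        for index, record in enumerate(records)
        for field in required_fields
        if record.get(field) is None
    ]


# Hypothetical week of records; the second has lost its vibration value.
records = [
    {"equipment_id": "PUMP-A", "vibration_mm_s": 2.1},
    {"equipment_id": "PUMP-A", "vibration_mm_s": None},
    {"equipment_id": "PUMP-B", "vibration_mm_s": 1.9},
]

sample = weekly_sample(records, sample_size=2, seed=1)
exceptions = exception_report(records, ["equipment_id", "vibration_mm_s"])
print(f"{len(sample)} records sent for validation, {len(exceptions)} exceptions")
```

Passing a seed makes the sample repeatable, which helps if a validator later needs to see exactly which records were reviewed.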