BIG DATA, SMART DATA AND BIG ANALYSIS


WHAT CAN THE PETRO-INDUSTRIES LEARN FROM BIG PHARMA AND THE ALLOTROPE FOUNDATION, AND WHERE SHOULD THE FUTURE LIE?


At first glance the Petro industries may not seem to have much in common with the Pharma industries. Petro produces vast quantities of bulk fuels, refined and petrochemical products, usually from large continuous processes, which at the minimum specification level must be fungible between suppliers. Conversely, in Pharma individual companies produce patented, pharmacologically active pure compounds in small quantities, usually from batch processes.


However, when it comes to the analytical world they would seem to have more in common. Both use a wide range of elemental and molecular characterisation techniques, either as standard methods to demonstrate product compliance and quality, or as bespoke methods in research, development and problem-solving activities. As the sophistication of analytical methods has developed, the volume and complexity of the analytical data produced has increased dramatically, for example through recent developments in comprehensive chromatographic techniques and high-resolution, high-mass-accuracy mass spectrometry (1)(2). In addition, much of this data is information rich at the molecular level and can offer new opportunities for developing structure-property relationships to improve process efficiency and develop new products with differentiated performance. Two examples of this could be:


• The detailed molecular information on crudes and process stream composition provided by Petroleomic approaches (3), when combined with historical process data, could lead to new routes to produce and value crudes, optimise refining strategy and identify key components leading to processing issues such as fouling and corrosion.


• By combining detailed major, minor and trace component analysis of formulated fuels or lubricants with test data from rig and engine performance tests and with consumer trends, using advanced data analytics, it should be possible to identify key chemical components and additives which generate differentiated performance and therefore premium prices and margins in the market.


The general hypothesis is: if I have more data at my fingertips, then I will have more answers.


But this is not necessarily the case, as real-world data is messy data, filled with inconsistencies, potential biases and noise. Therefore, to mine this wealth of data effectively and generate increased performance, we need an innovative approach to “Big Data” to generate “Smart Data”, with a view to achieving value through “Big Analysis”.


So, if we consider the current situation around analytical data in typical petro industry laboratories, we can identify several key constraints, including:


• Data Silos, even between laboratories in the same company


• Incompatible instruments and software systems, often with proprietary 3rd party data formats


• Legacy architectures are brittle and rigid with low connectivity


• Critical knowledge resides in people’s heads, with little common vocabulary


• Data schemas are not explicitly understood


• Lack of common vision and language between business units and scientists


• The Petro-Industries deal with products which are complex molecular mixtures, often with many thousands of differentially active components


All of this makes it almost impossible to move forward to a world of smart data and big analysis unless these constraints can be overcome.
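To make the constraints around proprietary formats and missing common vocabulary more concrete, the following is a minimal Python sketch of mapping records from two instruments onto one shared set of field names and units. The vendor names, field names, sample identifiers and values are all hypothetical and do not come from any real instrument or from the work described here; the sulfur value from the first vendor is assumed to be in ppm by mass.

```python
# Hypothetical mapping from vendor-specific field names to a common vocabulary.
# None of these names come from a real vendor format.
VENDOR_FIELD_MAP = {
    "vendor_a": {"SampleID": "sample_id", "S_ppm": "sulfur_mg_per_kg"},
    "vendor_b": {"sample": "sample_id", "sulphur_wt_pct": "sulfur_mg_per_kg"},
}

# Scale factors into the common unit (mg/kg).
# ppm by mass equals mg/kg; 1 wt% equals 10,000 mg/kg.
UNIT_SCALE = {
    ("vendor_a", "S_ppm"): 1.0,
    ("vendor_b", "sulphur_wt_pct"): 10_000.0,
}


def to_common(vendor, record):
    """Translate one vendor record into the common vocabulary and units."""
    mapping = VENDOR_FIELD_MAP[vendor]
    common = {}
    for field, value in record.items():
        if field not in mapping:
            continue  # fields without a mapping are dropped in this sketch
        scale = UNIT_SCALE.get((vendor, field))
        common[mapping[field]] = value * scale if scale is not None else value
    return common


records = [
    to_common("vendor_a", {"SampleID": "CR-001", "S_ppm": 12.0}),
    to_common("vendor_b", {"sample": "CR-002", "sulphur_wt_pct": 0.5}),
]

for rec in records:
    print(rec)
```

Once both records share the same keys and units they can be compared or pooled directly. In practice the mapping table itself is where the knowledge that currently “resides in people’s heads” would have to be written down.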


The 4 Vs of Big Data


Big data exponents often refer to the 4Vs of big data – namely Volume, Velocity, Variety and Veracity as illustrated in Figure 1.


Volume and Velocity are normally the areas covered by conventional big data analytics and have shown dramatic development over the past few years thanks to the technology developments of many companies such as Amazon, Google, Netflix, etc. Data Variety on the other hand speaks to the increasing types of data sources available to companies for analytics – something that continues to grow at a rapid rate (e.g., image data, video data, unstructured text data). Data Variety speaks to data complexity and here semantic technologies have a clear advantage due to their graph-based ability to connect various data on a conceptual and class-based level. Veracity refers to data uncertainty and abounds in scientific and experimental data. Here statistical and probability analysis is required with mathematical clustering techniques providing clear advantages – often referred to in contemporary circles as Data Science.
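The graph-based, class-level linking mentioned above can be illustrated with a toy example. The sketch below is a hand-rolled in-memory triple store in plain Python, not a real semantic-web library, and every class name, predicate and identifier in it (such as `ChromatographyRun`, `measuredSample` or `crude_A`) is a hypothetical placeholder chosen for illustration.

```python
class TripleStore:
    """A toy store of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def objects(self, subject, predicate):
        """All objects linked from `subject` by `predicate`."""
        return {o for (s, p, o) in self.triples if s == subject and p == predicate}

    def subjects(self, predicate, obj):
        """All subjects linked to `obj` by `predicate`."""
        return {s for (s, p, o) in self.triples if p == predicate and o == obj}

    def instances_of(self, cls):
        """All subjects typed as `cls` or as any (transitive) subclass of it."""
        classes = {cls}
        frontier = [cls]
        while frontier:
            current = frontier.pop()
            for sub in self.subjects("subClassOf", current):
                if sub not in classes:
                    classes.add(sub)
                    frontier.append(sub)
        return {s for c in classes for s in self.subjects("type", c)}


g = TripleStore()

# Class level: two kinds of analytical run are both measurements.
g.add("ChromatographyRun", "subClassOf", "Measurement")
g.add("MassSpecRun", "subClassOf", "Measurement")

# Instance level: data from different sources, linked through a shared sample.
g.add("run_17", "type", "ChromatographyRun")
g.add("run_42", "type", "MassSpecRun")
g.add("run_17", "measuredSample", "crude_A")
g.add("run_42", "measuredSample", "crude_A")
g.add("fouling_report_3", "type", "ProcessEvent")
g.add("fouling_report_3", "concernsSample", "crude_A")

# One class-based query spans both instrument types.
measurements_on_crude_a = sorted(
    m for m in g.instances_of("Measurement")
    if "crude_A" in g.objects(m, "measuredSample")
)
print(measurements_on_crude_a)
```

The query asks for anything that is a `Measurement`, so a new instrument type only needs one extra `subClassOf` statement to be included; no existing query has to change. That is the sense in which a class hierarchy helps with data Variety.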


Figure 1: The four Vs of big data

