Welcome to the fifth post in our series! As you may know, digital transformation is changing the way geological work is conducted and how innovative solutions are developed. Data science plays a crucial role in this evolution. No longer the preserve of ‘big tech’, data science is quickly becoming indispensable in our industry as we manage increasing data complexity and explore its potential in the energy transition.
In this blog post, we will dig into what data science means for us and how it can help us deliver better, faster, and more accurate results.
What is Data Science and Why Does it Matter?
Data science is an interdisciplinary field that combines statistical analysis, computer science, and domain expertise to extract meaningful insights and knowledge by uncovering patterns and predicting outcomes. Ultimately, this allows us to make better, more informed decisions. For geologists, this translates to transforming vast amounts of data, from rock analysis to well logs and third-party data, into actionable intelligence.
Machine learning, a subfield of AI, is a foundational component of present-day data science and is highly effective at extracting new patterns and insights from large datasets, often finding connections that would be impractical or impossible to identify manually. Machine learning algorithms can analyse many dimensions simultaneously, across volumes of data that are often far too large for manual interpretation. Using techniques such as unsupervised clustering, anomaly detection, and deep learning, they can identify intricate or hidden relationships within datasets.
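To make anomaly detection a little more concrete, here is a minimal sketch in Python. It uses a robust statistical rule (the modified z-score, based on the median absolute deviation) rather than a trained machine learning model, so treat it as a simple stand-in for the idea. The gamma-ray readings and the function name `mad_outliers` are invented for illustration.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold."""
    med = median(values)
    # Median absolute deviation: a spread estimate that spikes barely affect.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread at all, so nothing can be ranked as unusual
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical gamma-ray readings (API units); one reading is a spike.
gamma_ray = [62.0, 58.5, 60.1, 61.3, 59.8, 180.0, 60.7, 57.9]
flagged = mad_outliers(gamma_ray)
print(flagged)  # [5]
```

A flagged reading is only a prompt for a geologist to look closer: it could be a tool error or a genuine geological feature.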
Within geological assessment, we can utilise these machine learning components in many ways, of which some examples are provided below:
- Use unsupervised clustering and/or self-organising maps to detect facies-controlled patterns in wireline log data, i.e., electrofacies identification through machine learning. As more facies data becomes available, the machine learning algorithm continually re-evaluates the relationship between wireline logs and core facies, and the accuracy of electrofacies identification increases. Ultimately, this increases the value of core-derived facies interpretation, as the results of that analysis can be statistically applied to a much larger wireline log dataset. Importantly, this process also allows geologists to ‘ground truth’ the initial machine-learnt results before pushing the algorithm across the larger dataset: if the results are not as expected based on the known core facies, then parameters such as the number of clusters must be adjusted.
- Use multi-dimensional pattern analysis to find new patterns in big datasets, such as looking for new relationships between large numbers of different species within a basin, to determine whether building new, machine-led biozonations can lead to improved well correlation.
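The electrofacies workflow in the first example can be sketched in a few lines of Python. This is an illustrative, dependency-free sketch and not our production workflow: the log values are synthetic, the two facies and their properties are assumptions, and the helper names (`standardise`, `kmeans`, `purity`) are hypothetical. In practice a tested library implementation of k-means or self-organising maps would normally be used instead of hand-written code.

```python
import random
from collections import Counter


def standardise(rows):
    """Scale each log curve (column) to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        (sum((x - m) ** 2 for x in col) / len(col)) ** 0.5 or 1.0
        for col, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=25):
    """Plain k-means with deterministic farthest-point initialisation."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(dist2(p, c) for c in centres)))
    labels = []
    for _ in range(iterations):
        # Assign every sample to its nearest centre, then move the centres.
        labels = [min(range(k), key=lambda j: dist2(p, centres[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def purity(cluster_labels, facies):
    """Fraction of samples whose cluster's dominant facies matches their own."""
    matched = 0
    for cluster in set(cluster_labels):
        members = [f for lab, f in zip(cluster_labels, facies) if lab == cluster]
        matched += Counter(members).most_common(1)[0][1]
    return matched / len(facies)


# Synthetic wireline samples: (gamma ray in API units, bulk density in g/cm3).
rng = random.Random(42)
logs, true_facies = [], []
for facies, gr_mean, rho_mean in [("sand", 40.0, 2.30), ("shale", 110.0, 2.55)]:
    for _ in range(40):
        logs.append([rng.gauss(gr_mean, 6.0), rng.gauss(rho_mean, 0.03)])
        true_facies.append(facies)

electrofacies = kmeans(standardise(logs), k=2)

# Ground truth on the "cored" subset only (here, every fourth sample).
cored = range(0, len(logs), 4)
score = purity([electrofacies[i] for i in cored], [true_facies[i] for i in cored])
print(f"purity against core facies: {score:.2f}")
```

The purity score is the ‘ground truth’ step: if the clusters do not line up with the known core facies, the number of clusters (`k`) or the choice of input logs is revisited before the model is applied to uncored intervals.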
Computer vision, particularly image analysis and its sub-branches of object detection and image classification, is playing an increasingly pivotal role in the application of data science within the geological sector. By applying these image analysis techniques to high-resolution scans of palynological or nannopalaeontological slides, it is possible to begin automatically identifying certain species, leading to augmented or semi-automated data collection. Whilst this may reduce the analysis time required per sample, it can also increase the statistical validity of the quantitative results because, in theory, this type of augmented analysis should remove human operator bias when analysing each sample.
Best Practices for Data Collection, Management, and Storage
For data science to be applied effectively in any scenario, the collection, management, and storage of the structured and unstructured data to be analysed must be effective and robust. As such, establishing key practices for data collection, curation, and storage is critical. These could include:
- Collecting Quality Data: Ensure the accuracy, consistency, and relevance of your data by adhering to standardised formats and protocols. This minimises errors and ensures that all data points can be easily integrated and analysed.
- Organising Data Effectively: Implement a robust data management strategy to keep everything organised and constrained by industry-standard data models. This is particularly important when creating a centralised “data lake”, to facilitate easy access, retrieval, sharing, and analysis of the datasets and data types within it. It also helps to maintain metadata (data about data) centrally, so everyone knows what each dataset represents, how it was collected, and where it belongs.
- Prioritising Data Security: Safeguarding sensitive data through encryption, access controls (such as row-level entitlements), and regular backups is critical, particularly with increasing concerns about data privacy and cybersecurity. When building or using any cloud-based solution, a rigorous penetration testing plan is important: it identifies potential security vulnerabilities in the solution, enables prompt fixes before any system goes live, and provides an auditable evidence trail. Multi-factor authentication (MFA) is, where possible, a must-have for solutions that will eventually host data for, or serve it to, people outside your organisation.
- Leveraging Cloud Storage: Utilising cloud-based storage solutions is critical to enhance scalability and provide easy, global access to datasets for both internal and external users, such as client partners.
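As a small illustration of the first practice above (standardised formats and protocols), the sketch below checks incoming sample records against an agreed set of fields before they are loaded into a data store. The field names, allowed sample types, and example records are all hypothetical and not taken from any particular industry data model.

```python
# Hypothetical record standard: field name -> expected Python type.
REQUIRED_FIELDS = {
    "well_id": str,
    "depth_m": float,
    "sample_type": str,
    "collected_by": str,
}
ALLOWED_SAMPLE_TYPES = {"core", "cuttings", "sidewall core"}


def validate_record(record):
    """Return a list of problems found in one sample record (empty if clean)."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(f"{name}: expected {expected_type.__name__}")
    depth = record.get("depth_m")
    if isinstance(depth, float) and depth < 0:
        problems.append("depth_m: must not be negative")
    sample_type = record.get("sample_type")
    if isinstance(sample_type, str) and sample_type not in ALLOWED_SAMPLE_TYPES:
        problems.append(f"sample_type: unknown value {sample_type!r}")
    return problems


good = {"well_id": "W-001", "depth_m": 2450.5,
        "sample_type": "core", "collected_by": "A. Geologist"}
bad = {"well_id": "W-002", "depth_m": "2450", "sample_type": "Core"}

print(validate_record(good))  # []
print(validate_record(bad))
```

Rejecting or flagging records at the point of entry is usually far cheaper than untangling inconsistent depths, units, or labels once they are mixed into a shared dataset.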
Training and Upskilling our Staff in Data Science Techniques
To fully harness the power of data science and the innovation that it can bring to our or any organisation, we must invest in our team’s skills and knowledge. This involves:
- Offering Training Programmes: We are investing in training programmes, such as online code academies and on-the-job opportunities, that cover the basics of data science, including data analytics, machine learning, and programming languages like R, SQL, and Python.
- Fostering a Culture of Experimentation: We are working to build a culture where team members are encouraged to experiment with new tools, share their findings, and continuously learn from each other. It is OK to fail!
- Collaborating with Academia: We have ongoing discussions and opportunities to partner with universities and research institutions, to stay on top of the latest data science developments and tap into fresh talent and ideas.
Final Thoughts
Data science offers immense potential to revolutionise our industry. By adopting a data-driven mindset and investing in the necessary skills and tools, we can deliver superior insights, reduce costs, and enhance client satisfaction. As we continue our digital transformation journey, we will overcome challenges, embrace innovation, and shape the future of geological consultancy.
In our next post, we’ll discuss how to keep the momentum going, from overcoming common challenges to continuously evolving the digital strategy of PetroStrat. Stay tuned — the journey is just beginning!
Blog series brought to you by our Digital and Data Team (part of the New Ventures department)
Other Blogs In This Series:
Blog 1: The Future of Geological Consultancy, Embracing Digital Transformation
Blog 2: Why Digitalisation Is Crucial For Geological Consultancies
Blog 3: Building a Digital Foundation: How We Took Our First Steps Towards Digitalisation
Blog 4: From Laboratory to Cloud: Digital Tools and Technologies in Geological Consultancy
Blog 5: Harnessing the Power of Data Science for our Consultancy
Mark Weldon – New Ventures Director
Mark Weldon is a geoscientist with over 30 years of experience in the energy sector. Having co-founded and helped grow PetroStrat into a leading consulting firm, Mark has a proven track record of applying geological and stratigraphic expertise to complex subsurface challenges. Currently focused on the energy transition, Mark is dedicated to leveraging PetroStrat’s world-leading geoscience expertise to drive a sustainable future.