Words by Richard Bumann
Data Science Consultant at ERNI

Manage expectations, prove and challenge results, communicate outcomes. In each stage of your data science project, these are some of the unspoken expectations your team and leaders will have of you.
Navigating these challenges is not easy, so let’s take a closer look at what pitfalls to avoid, what tough decisions to make and whose engagement is the most important during each stage of a data science project.
The individual stages of your data science project as described below don’t necessarily happen in a strictly sequential order. You can move back and forth between them or repeat the cycle a number of times to tackle new challenges. Always make a conscious, well-founded decision before proceeding to the next stage, because each transition changes who the stakeholders are and how they need to be managed.

Business understanding: Make sure all stakeholders understand the business goals of your project

During the initial stage, people should understand not only the benefits of the project but also what the project cannot deliver. Be careful when defining and limiting the scope of the project.
Stakeholders in this phase:
Business end users, business analysts and data scientists.
Obstacles:
If the vision or idea of what should be achieved is too broad, it has to be narrowed down. Keep in mind that business end users and framing conditions are multifaceted and include more than just people, e.g., legal bodies or security regulations.
Difficult but correct decisions:
Abort the project if the business idea is not feasible or the benefits are not viable.
Engagement of stakeholders:
Illustrate the possibilities with well-designed examples and set realistic expectations.

Data understanding: Let’s discuss available data

In this phase, you’ll be mapping the data landscape and discussing data storage and possibilities to integrate and merge data. It is also important to assess the quality and completeness of the data.
Stakeholders in this phase:
Business analysts, data scientists or data analysts, data engineers and IT.
Obstacles:
(a) Miscommunication between business analysts and data analysts; (b) miscommunication between data scientists and data engineers, leading to poor identification of necessary and available data sources; (c) a strained relationship between IT and the data team because of differing goals; (d) missed opportunities to uncover poor data quality and data gaps.
Engagement of stakeholders:
Illustrate the benefits of the data project. Involve IT in the process early so that they don’t feel left out and so that collaboration doesn’t come down to abrupt requests such as “extract data for us immediately”.
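
Assessing the quality and completeness of the data, as described for this phase, can start with a very simple check. The following is a minimal sketch in Python, assuming the data is available as a list of records; the function name `completeness_report`, the field names and the sample rows are hypothetical and invented purely for illustration.

```python
from collections import Counter


def completeness_report(rows, required_fields):
    """Return, per required field, the share of rows with a usable value."""
    total = len(rows)
    filled = Counter()
    for row in rows:
        for field in required_fields:
            value = row.get(field)
            # Treat None and blank strings as missing.
            if value is not None and str(value).strip() != "":
                filled[field] += 1
    return {
        field: (filled[field] / total if total else 0.0)
        for field in required_fields
    }


# Hypothetical sample records, invented for illustration only.
sample_rows = [
    {"customer_id": "C1", "region": "North", "revenue": 120.0},
    {"customer_id": "C2", "region": "", "revenue": 80.0},
    {"customer_id": "C3", "region": "South", "revenue": None},
    {"customer_id": "C4", "region": "South", "revenue": 45.5},
]

report = completeness_report(sample_rows, ["customer_id", "region", "revenue"])
for field, share in report.items():
    print(f"{field}: {share:.0%} complete")
```

A per-field overview like this gives business analysts, the data team and IT a shared, concrete basis for discussing data gaps before any modelling starts.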

Data preparation: Getting the data in shape

In this phase, make sure you have the right data in the best quality possible.
Stakeholders in this phase:
Data scientists, data analysts and data team, business stakeholders.
Obstacles:
Incomplete or ‘dirty’ data, a lack of resources from the IT department, or a lack of engineering support that could help access data and improve its quality.
What can go wrong:
(a) If the data scientists don’t talk to the business and other domain stakeholders, they might miss important facts needed to prepare and clean the data for a proper analysis; (b) if cleaning is needed, there might not be enough resources for it, even though it should be done by professionals. Data teams need to include cleaning in their planning and occasionally involve data engineers to clean data at the source.
Engagement of stakeholders:
Clearly communicate how important a clean database is for a correct analysis. Secure the necessary resources from IT.
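
To make the cleaning effort tangible when planning it, the following minimal sketch in Python shows what a basic cleaning pass might look like. It assumes records arrive as dictionaries; the function name `clean_records`, the key field and the sample rows are hypothetical. Rows that cannot be repaired automatically are set aside rather than dropped, so they can be clarified with the business or fixed at the source.

```python
def clean_records(rows, key_field):
    """Trim text values, drop duplicate keys and set aside rows without a key."""
    cleaned, rejected, seen = [], [], set()
    for row in rows:
        # Trim surrounding whitespace on every string value.
        tidy = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        key = tidy.get(key_field)
        if not key:
            # No usable key: needs input from the business or the source system.
            rejected.append(tidy)
            continue
        if key in seen:
            # Duplicate key: keep only the first occurrence.
            continue
        seen.add(key)
        cleaned.append(tidy)
    return cleaned, rejected


# Hypothetical raw records, invented for illustration only.
raw_rows = [
    {"customer_id": " C1 ", "region": "North "},
    {"customer_id": "C1", "region": "North"},
    {"customer_id": "", "region": "South"},
    {"customer_id": "C2", "region": " South"},
]

cleaned, rejected = clean_records(raw_rows, "customer_id")
print(f"{len(cleaned)} clean rows, {len(rejected)} rows need clarification")
```

Counting the rejected rows also gives a rough, shareable figure for how much cleaning work the project has to plan for.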
