The Various Roles That Act as Both a Checkpoint and a Helper Throughout the Lifecycle

In data analysis and consulting, different jobs sit along what is known as the “data lifecycle.” The data lifecycle is a representation of every stage data passes through, from creation to dissemination. Today, we’re here to discuss how various staff roles are used in a typical data lifecycle, and which of those roles are the most important.

The table below lists the major staffing roles found throughout the lifecycle, grouped by role type (mission portfolio, data portfolio, and data technology expert). Each entry gives the job title and a brief description of the role.

Staffing Roles

Title – Description

Mission Portfolio Roles

Sponsor, Mission Subject Matter Expert (SME) – They know the questions they want to ask, what they do today, and what they want to do next.
Data Leadership Role – The representative of the data organization once it is made into a formal division or function. This includes responsibility for governance, policy, and data management leadership. Depending on maturity, this may be a Chief Data Officer, Data Management Lead, or Data Management Working Group Rotating Lead.
Business Analyst – Supports the Sponsor in documenting and detailing the questions being asked. Helps identify data sources, SMEs, and supporting material.
Data Catalog Expert – They know the data they want to use, have a general assessment of its readiness, and know who to talk to. They understand the metadata that needs to be present to make data usable and user-friendly.

Data Portfolio Roles

Data Suppliers – They provide the data and know their source.
Data Analyst – They provide the modeling analysis of the supply: the objects, fields, values, rules, and qualities needed.
Data Engineer – They provide the ELT data pipeline, maintain the pipeline and lake for the supplies, and automate the supply of the data.
Data Architect – Designs the data system, outlines the dataflow, and defines how and where the different roles use the system.
Data Scientist – They know how to translate, via relational or AI/ML scripts, the reference, temporal, financial, and/or geospatial data needed to integrate data across supplies within the pipeline before it becomes available to the enterprise. They understand methodology and have a toolchest of AI resources that can convert data into knowledge.
Data Developer – They make the final transformations that produce the end product: a service, product, map, app, API, etc. They also automate the delivery of the data products.

Data Technology Expert Roles

Data Solution Architect – Supports the requirements definition and leads the solution design analysis and recommendations for various platforms, flows, technology stacks, and planning, from MVP through features through architecture qualities.
Platform / App Developer – Develops end-user-facing tools for interacting with the data platform and/or creates data services for accessing the data.
Data Wrangler – Handles the likely large number of data quality issues, working for the data developer and with design roles to address issues with the design solution’s assumptions and the supplier. Usually requires strong transformation scripting skills.
Cloud DevOps – Initially, they design the cloud environment and deployment needed to facilitate the supply, developer, and user flows to meet service level expectations, using Infrastructure as Code.
Data Administrator – They administer the data environment (the lake, cube, fabric, warehouse, etc.), including performance and data lifecycle service management monitoring. Also known as a data platform or data system administrator.
Cloud DevOps Administrator – They provide the managed cloud service environment’s security, IT compliance, and service level agreements.

So What Roles Help Out The Different Parts of Data Lifecycle Management The Most?

To answer that question, first, we need to understand that there are typically three key phases of the data lifecycle: 

  • Planning – Focuses on Customer Relationship Management for the Needs and Requirements Lifecycle, as well as the Cost Benefit Analysis of data assets, Source Data Acquisition, Quality Planning, and Funding Planning.
  • Production and Data Lifecycle Management – Covers process evaluation and improvements, establishing improved automation in data pipelines (ETL, ELT), and maintaining the end-to-end stewardship of the data (define, inventory/evaluate, obtain, access, maintain, use/evaluate, and archive). First and foremost, this group extracts source data, loads it, and executes transforms to support delivery’s goals.
  • Service Delivery – Creates data derivatives for delivery (downloads, services, packages, composite products), and architects and supports service platforms and APIs. It also manages infrastructure to be FAIR (Findable, Accessible, Interoperable, Reusable), improves discovery, and encourages and enables community and collaboration.

To assign responsibility to each major role within these three parts of the data lifecycle, consider the concept of RASCI (Responsible, Accountable, Supports, Consults, Informs). A RASCI is a responsibility assignment matrix that clarifies the responsibilities of a particular role during preparation and implementation. The RASCI table below groups each role by responsibility: RA is primary, and SCI is support-based. A cell filled with an R, an A, or both marks a leadership role, while a cell filled with an S, C, I, or any combination marks a knowledge/support role.

The graph below the table further categorizes roles by the part of the data lifecycle they fall under.
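The leadership/support rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_cell` is hypothetical, and the handling of a cell that mixes R/A with S/C/I is an assumption, since the article does not say how such a cell should be read.

```python
# Hypothetical helper: the article defines the RASCI letters, not this function.
LEADERSHIP = set("RA")   # Responsible, Accountable
SUPPORT = set("SCI")     # Supports, Consults, Informs


def classify_cell(codes: str) -> str:
    """Classify one RASCI table cell as 'leadership', 'support', or 'none'."""
    letters = set(codes.upper().replace(" ", ""))
    unknown = letters - LEADERSHIP - SUPPORT
    if unknown:
        raise ValueError(f"not RASCI letters: {sorted(unknown)}")
    if letters & LEADERSHIP:
        # Assumption: a cell mixing R/A with S/C/I still counts as leadership.
        return "leadership"
    if letters:
        return "support"
    return "none"  # empty cell: the role has no part in that phase


print(classify_cell("RA"))
print(classify_cell("SCI"))
```

An empty cell is reported as "none", so the same function can be run over every cell of a RASCI table without special-casing the blanks.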

Title | Planning | Production Operations | Delivery Services
Mission Portfolio Roles
Sponsor, Mission SME – RA
Data Mgmt Leadership Role – C
Business Analyst – SC
Data Catalog Expert – SCI
Data Portfolio Roles
Data Suppliers – CI
Data Analyst – CI
Data Engineer – RA
Data Architect – SCCISC
Data Scientist – SCI
Data Developer – SCIS
Technology Portfolio Roles
Data Solution Architect – CIRA
Platform/App Developer – SCI
Data Wrangler – S
Cloud DevOps – SCI
Data Administrator – S
Cloud DevOps Administrator – S

The Takeaway

Note how the planning stage involves all four of the roles in the “mission portfolio” section. Some believe that without the proper use of these roles to plan the lifecycle of data, the entire operation falls apart. Next, we have the “data portfolio roles”, which are spread far more widely across the lifecycle than the mission portfolio roles. Even so, a good portion of them appear as S, C, I, or some combination of the three. Finally, we have the “technology portfolio roles”, which appear mostly in production operations and delivery services. As with the data portfolio roles, a good portion of them appear as S, C, and I.

Following the logic that taking part in more parts of the lifecycle makes a role more helpful (quantity), Data Architects are arguably the most important, because their responsibilities spread across all three parts of the data lifecycle. Between this and the paragraph just before the graph, it’s ultimately a matter of opinion. So look back at the first table, and decide for yourself who helps out the most.

Okay, But What Roles Help Out the Most During The Project’s ‘Data Lifecycle’?

We will now look at the question of who helps out the most in the data lifecycle from a completely new perspective. At Xentity, we closely follow the concept of Discovery-Define-Design-Develop-Implement-Maintain, so we will re-examine the roles with the same rules. In the table below, blue defines a leadership role and green represents a support role. In the section that follows, we will examine another graph that asks whether you can say which role helps out the most from a quantitative perspective.

Quantitative Perspective

Title | Discovery | Define | Design | Develop | Implement | Operate/Maintain
Mission Portfolio Roles
Sponsor, Mission SME – Mission, Mission
Data Mgmt Leadership Role – x, x, Mission, Mission
Business Analyst – x, Tech, x, x
Data Catalog Expert – x, x, x
Data Portfolio Roles
Data Suppliers – x, x, x
Data Analyst – x, x, x
Data Architect – Tech, x, Mission
Data Scientist – x, x, x
Data Developer – x, x, x, x
Technology Portfolio Roles
Data Solution Architect – x, Tech, Mission
Platform/App Developer – x, x, Tech
Data Wrangler – x, x, x
Cloud DevOps – x, x, Tech
Data Administrator – x, Data Tech
Cloud DevOps Administrator – x, x, IT Tech

The Takeaway

So, from a more quantitative perspective, the roles of Data Management Leadership, Business Analyst, and Data Developer arguably help the data lifecycle the most, because each has a role in 4 of the 6 development/implementation stages that we at Xentity follow so closely. As in the previous graph, you can also see that certain roles are far more involved throughout the lifecycle than others. Alternatively, following the idea that success starts at the top, you could argue Data Management Leadership is the most helpful: not only is it involved at four points in the development process, it also holds two leadership roles.

But, much like the results of the previous graph, your mileage may vary (your experience might be different). So, we encourage you to look at the first table once more. Using the graph and the description of each role, decide for yourself which role helps the data lifecycle the most.
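The tally behind this quantitative takeaway can be reproduced with a short Python sketch. The cell values are copied in row order from the table above; the names `PHASE_MARKS`, `phases_involved`, and `labeled_cells` are hypothetical, and treating a labeled cell (anything other than a plain "x") as a leadership mark is an assumption based on the remark that Data Management Leadership holds two leadership roles.

```python
# Cell values per role, copied in row order from the lifecycle table.
# Column (phase) positions are not modeled; only the marks per role are.
PHASE_MARKS = {
    "Sponsor, Mission SME": ["Mission", "Mission"],
    "Data Mgmt Leadership Role": ["x", "x", "Mission", "Mission"],
    "Business Analyst": ["x", "Tech", "x", "x"],
    "Data Catalog Expert": ["x", "x", "x"],
    "Data Suppliers": ["x", "x", "x"],
    "Data Analyst": ["x", "x", "x"],
    "Data Architect": ["Tech", "x", "Mission"],
    "Data Scientist": ["x", "x", "x"],
    "Data Developer": ["x", "x", "x", "x"],
    "Data Solution Architect": ["x", "Tech", "Mission"],
    "Platform/App Developer": ["x", "x", "Tech"],
    "Data Wrangler": ["x", "x", "x"],
    "Cloud DevOps": ["x", "x", "Tech"],
    "Data Administrator": ["x", "Data Tech"],
    "Cloud DevOps Administrator": ["x", "x", "IT Tech"],
}


def phases_involved(marks):
    """Count how many phases each role is marked in."""
    return {role: len(cells) for role, cells in marks.items()}


def labeled_cells(marks):
    """Count labeled cells per role.

    Assumption: a labeled cell (not a plain "x") marks a leadership role.
    """
    return {role: sum(cell != "x" for cell in cells)
            for role, cells in marks.items()}


counts = phases_involved(PHASE_MARKS)
top = max(counts.values())
most_involved = sorted(role for role, n in counts.items() if n == top)
leads = labeled_cells(PHASE_MARKS)

print(top, most_involved)
# Tie-break among the most involved roles by number of labeled cells.
print(max(most_involved, key=leads.get))
```

Counting marks alone yields a three-way tie at four phases, which is why the paragraph above needs a second criterion to single out one role.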

Each Role Matters

A lot goes into supporting the data lifecycle, from beginning to end. It could be your Mission Portfolio Roles (which determine what kind of data is needed and what questions need to be asked). Or it could be your Data Portfolio Roles (which provide ETL pipelines and microservices). Or it could even be the Data Technology Experts (who provide data platforms and design environments for data). You can make the argument that some roles help the data lifecycle more, especially if you look at how often we use them from a quantitative perspective. But make no mistake: each role is useful in the lifecycle.

At Xentity, we love data, and that love is why we have often involved ourselves at various points in the data lifecycle. Check out our services page to learn more about how we offer our services for your data.