The Various Roles That Act as Both a Checkpoint and a Helper Throughout the Lifecycle
In data analysis and consulting, different jobs sit along what is known as the “data lifecycle.” The data lifecycle represents every stage data passes through, from creation to dissemination. Here we discuss how various staff roles are used in a typical data lifecycle, and which of those roles matter most.
The table below lists the major staffing roles found throughout the lifecycle. It gives each job title, grouped by role type (mission portfolio, data portfolio, or data technology expert), along with a brief description of the role.
Staffing Roles
Title | Description |
Mission Portfolio Roles | |
Sponsor, Mission Subject Matter Expert (SME) | They know the questions they want to ask, what they do today, and what they want to do next. |
Data Leadership Role | The representative of the data organization once it becomes a formal division or function. Responsible for governance, policy, and data management leadership. Depending on maturity, this may be a Chief Data Officer, Data Management Lead, or Data Management Working Group Rotating Lead. |
Business Analyst | Supports the Sponsor in documenting and detailing the questions being asked. Helps identify data sources, SMEs and supporting material. |
Data Catalog Expert | They know the data they want to use, have a general assessment of its readiness, and know who to talk to. They understand the metadata that must be present to make data usable and user-friendly. |
Data Portfolio Roles | |
Data Suppliers | They provide the data and know their source |
Data Analyst | They provide the modeling analysis of the supplied data: the objects, fields, values, rules, and qualities needed. |
Data Engineer | They build and maintain the ELT data pipeline and the lake for the supplies. They also automate the supply of the data. |
Data Architect | Designs the data system, outlines the dataflow and defines and designs how and where the different roles utilize the system. |
Data Scientist | They know how to translate, via relational or AI/ML scripts, the reference, temporal, financial, and/or geospatial data needed to integrate data across supplies within the pipeline before it becomes available to the enterprise. They understand methodology and have a toolchest of AI resources that can convert data into knowledge. |
Data Developer | They can make the final end transformations to make the end product – a service, product, map, app, API, etc. They also automate the delivery of the data products. |
Data Technology Expert Roles | |
Data Solution Architect | Supports the requirements definition and leads the solution design analysis and recommendation for various platforms, flows, technology stacks, and planning, from MVP through features through architecture qualities. |
Platform / App Developer | Develop end user facing tools for interaction with the data platform and/or create data services for accessing the data. |
Data Wrangler | Handles the likely large volume of data quality issues, working for the data developer and with design roles to reconcile those issues with the design solution’s assumptions and the supplier. Usually requires strong transformation scripting skills. |
Cloud DevOps | Initially, they design the cloud environment and deployment needed to support the supply, developer, and user flows and meet service level expectations. Uses Infrastructure as Code. |
Data Administrator | They administer the data environment (lake, cube, fabric, warehouse, etc.), including performance monitoring and data lifecycle service management. Also known as a data platform or data system administrator. |
Cloud DevOps Administrator | They provide the managed cloud service environment’s security, IT compliance, and service level agreements. |
So What Roles Help Out The Different Parts of Data Lifecycle Management The Most?
To answer that question, first, we need to understand that there are typically three key phases of the data lifecycle:
- Planning – Focuses on Customer Relationship Management for the Needs and Requirements Lifecycle, as well as the Cost Benefit Analysis of data assets, Source Data Acquisition, Quality Planning, and Funding Planning.
- Production and Data Lifecycle Management – Covers process evaluation and improvement, better automation in data pipelines (ETL, ELT), and end-to-end stewardship of the data (define, inventory/evaluate, obtain, access, maintain, use/evaluate, and archive). First and foremost, this group extracts source data, loads it, and executes transforms to support delivery’s goals.
- Service Delivery – Creates data derivatives for delivery (downloads, services, packages, composite products), and architects and supports service platforms and APIs. It also manages infrastructure to be FAIR (Findable, Accessible, Interoperable, Reusable), improves discovery, and encourages and enables community and collaboration.
To assign responsibility to each major role within these three phases, consider the concept of RASCI (Responsible, Accountable, Supports, Consults, Informs). A RASCI is a responsibility assignment matrix that clarifies what each role is responsible for during preparation and implementation. The RASCI table below groups each role by responsibility: a cell containing an R, an A, or both marks a leadership role, while a cell containing an S, C, I, or any combination of them marks a knowledge/support role.
The graph below the table further categorizes roles by the part of the data lifecycle they fall under.
Title | Planning | Production Operations | Delivery Services |
Mission Portfolio Roles | |||
Sponsor, Mission SME | RA | ||
Data Mgmt Leadership Role | C | ||
Business Analyst | SC | ||
Data Catalog Expert | SCI | ||
Data Portfolio Roles | |||
Data Suppliers | CI | ||
Data Analyst | CI | ||
Data Engineer | RA | ||
Data Architect | SC | CI | SC |
Data Scientist | SCI | ||
Data Developer | SCI | S | |
Technology Portfolio Roles | |||
Data Solution Architect | CI | RA | |
Platform/App Developer | SCI | ||
Data Wrangler | S | ||
Cloud DevOps | SCI | ||
Data Administrator | S | ||
Cloud DevOps Administrator | S |
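The leadership/support rule described above can be sketched in a few lines of Python. This is only an illustration, not a Xentity tool: the `RASCI` dictionary and the function names are hypothetical, and each role simply maps to the codes in its filled cells, in the order they appear in its row, without assigning them to a particular phase.

```python
# Hypothetical transcription of the RASCI table: each role maps to the
# codes found in its filled cells, in the order they appear in the row.
RASCI = {
    "Sponsor, Mission SME": ["RA"],
    "Data Mgmt Leadership Role": ["C"],
    "Business Analyst": ["SC"],
    "Data Catalog Expert": ["SCI"],
    "Data Suppliers": ["CI"],
    "Data Analyst": ["CI"],
    "Data Engineer": ["RA"],
    "Data Architect": ["SC", "CI", "SC"],
    "Data Scientist": ["SCI"],
    "Data Developer": ["SCI", "S"],
    "Data Solution Architect": ["CI", "RA"],
    "Platform/App Developer": ["SCI"],
    "Data Wrangler": ["S"],
    "Cloud DevOps": ["SCI"],
    "Data Administrator": ["S"],
    "Cloud DevOps Administrator": ["S"],
}


def is_leadership(cell: str) -> bool:
    """A cell counts as leadership if it contains an R or an A."""
    return any(code in "RA" for code in cell)


def leadership_roles(matrix: dict) -> list:
    """Roles with at least one leadership (R and/or A) cell."""
    return [
        role
        for role, cells in matrix.items()
        if any(is_leadership(cell) for cell in cells)
    ]


def widest_roles(matrix: dict) -> list:
    """Roles that have the largest number of filled cells."""
    most = max(len(cells) for cells in matrix.values())
    return [role for role, cells in matrix.items() if len(cells) == most]


if __name__ == "__main__":
    print("Leadership:", leadership_roles(RASCI))
    print("Widest coverage:", widest_roles(RASCI))
```

Keeping the matrix as plain data means the same two questions (who leads, who covers the most phases) can be asked again if the table changes.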
The Takeaway
Note how the planning stage involves the four roles in the “mission portfolio” section. Some believe that without proper use of these roles to plan the lifecycle of data, the entire operation falls apart. Next come the “data portfolio roles”, which are spread far more widely than the mission group. Even so, a good portion of them carry S, C, I, or some combination of the three. Finally, the “technology portfolio roles” appear mostly in production operations and delivery services, and a good portion of them also carry S, C, and I, much like the data portfolio roles.
If you follow the logic that covering more parts of the lifecycle makes a role more helpful (quantity), Data Architects are arguably the most important, because their responsibilities span all three parts of the data lifecycle. Between this and the paragraph just before the graph, it’s ultimately a matter of opinion. So look back at the first table and decide for yourself who helps out the most.
Okay, But What Roles Help Out the Most During The Project’s ‘Data Lifecycle’?
We now look at the question of who helps out the most from a completely different perspective: the project lifecycle. At Xentity, we closely follow the concept of Discovery-Define-Design-Develop-Implement-Maintain, so we re-examine the roles with the same rules. In the table below, blue marks a leadership role and green marks a support role. The section that follows examines another graph and asks whether you can say which role helps out the most from a quantitative perspective.
Quantitative Perspective
Title | Discovery | Define | Design | Develop | Implement | Operate/Maintain |
Mission Portfolio Roles | ||||||
Sponsor, Mission SME | Mission | Mission | ||||
Data Mgmt Leadership Role | x | x | Mission | Mission | ||
Business Analyst | x | Tech | x | x | ||
Data Catalog Expert | x | x | x | |||
Data Portfolio Roles | ||||||
Data Suppliers | x | x | x | |||
Data Analyst | x | x | x | |||
Data Architect | Tech | x | Mission | |||
Data Scientist | x | x | x | |||
Data Developer | x | x | x | x | ||
Technology Portfolio Roles | ||||||
Data Solution Architect | x | Tech | Mission | |||
Platform/App Developer | x | x | Tech | |||
Data Wrangler | x | x | x | |||
Cloud DevOps | x | x | Tech | |||
Data Administrator | x | Data Tech | ||||
Cloud DevOps Administrator | x | x | IT Tech |
The Takeaway
From a quantitative perspective, Data Management Leadership, Business Analysts, and Data Developers arguably help the data lifecycle the most, because each is involved in 4 of the 6 development/implementation stages we at Xentity follow so closely. As in the previous graph, certain roles are far more involved throughout the lifecycle than others. Alternatively, following the idea that success starts at the top, you could argue Data Management Leadership is the most helpful: not only is it involved at four points in the development process, it also holds two leadership roles.
But, much like the results of the previous graph, your mileage may vary. So we encourage you to look at the first table once more. Using the graph and the description of each role, decide for yourself which role helps the data lifecycle the most.
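For readers who want to redo the count themselves, here is a minimal Python sketch of the tally behind this takeaway. The `INVOLVEMENT` dictionary and function names are hypothetical; each role maps to the entries in its row of the project lifecycle table. The sketch assumes that labeled entries (Mission, Tech, Data Tech, IT Tech) mark leadership and that `x` marks support, which is an interpretation based on the statement that Data Management Leadership holds two leadership roles.

```python
# Hypothetical transcription of the project lifecycle table: each role
# maps to the entries in its row, in order. Assumption: a labeled entry
# (e.g. "Mission", "Tech") marks leadership and "x" marks support.
INVOLVEMENT = {
    "Sponsor, Mission SME": ["Mission", "Mission"],
    "Data Mgmt Leadership Role": ["x", "x", "Mission", "Mission"],
    "Business Analyst": ["x", "Tech", "x", "x"],
    "Data Catalog Expert": ["x", "x", "x"],
    "Data Suppliers": ["x", "x", "x"],
    "Data Analyst": ["x", "x", "x"],
    "Data Architect": ["Tech", "x", "Mission"],
    "Data Scientist": ["x", "x", "x"],
    "Data Developer": ["x", "x", "x", "x"],
    "Data Solution Architect": ["x", "Tech", "Mission"],
    "Platform/App Developer": ["x", "x", "Tech"],
    "Data Wrangler": ["x", "x", "x"],
    "Cloud DevOps": ["x", "x", "Tech"],
    "Data Administrator": ["x", "Data Tech"],
    "Cloud DevOps Administrator": ["x", "x", "IT Tech"],
}


def lead_count(entries: list) -> int:
    """Number of entries assumed to be leadership (anything except 'x')."""
    return sum(1 for entry in entries if entry != "x")


def most_involved(table: dict) -> list:
    """Roles that appear in the largest number of stages."""
    most = max(len(entries) for entries in table.values())
    return [role for role, entries in table.items() if len(entries) == most]


if __name__ == "__main__":
    for role in most_involved(INVOLVEMENT):
        entries = INVOLVEMENT[role]
        print(f"{role}: {len(entries)} stages, {lead_count(entries)} leadership")
```

Counting stages and counting leadership entries are separate measures, which is why the two arguments in the takeaway can point to different roles.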
Each Role Matters
A lot goes into supporting the data lifecycle from beginning to end. It could be your Mission Portfolio Roles (which determine what kind of data is needed and what questions need to be asked), your Data Portfolio Roles (which provide ETL pipelines and microservices), or even your Data Technology Experts (who provide data platforms and design environments for data). You can make the argument that some roles help the data lifecycle more than others, especially if you look at how often we use them from a quantitative perspective. But make no mistake: each role is useful in the lifecycle.
At Xentity we love data and have often involved ourselves in various points in the data lifecycle because of that love for data. Check out our services page to learn more about how we offer our services for your data.