The Various Roles That Act as Both a Checkpoint and a Helper Throughout the Lifecycle
In data analysis and consulting, many different jobs sit along what is known as the “data lifecycle.” The data lifecycle represents every stage data passes through, from its creation to its dissemination. Today, we discuss how various staff roles are used in a typical data lifecycle and which roles are the most important.
The table below lists the major staffing roles found throughout the lifecycle. It is grouped by role type (mission portfolio, data portfolio, and data technology expert) and gives the job title and a brief description of each role.
Mission Portfolio Roles
|Sponsor, Mission Subject Matter Expert (SME)||They know the questions they want to ask, what they do today, and what they want to do next.|
|Data Leadership Role||The representative of the data organization once it is made into a formal division or function. This includes responsibility for governance, policy, and data management leadership. Depending on maturity, this may be a Chief Data Officer, Data Management Lead, or Data Management Working Group Rotating Lead.|
|Business Analyst||Supports the Sponsor in documenting and detailing the questions being asked. Helps identify data sources, SMEs and supporting material.|
|Data Catalog Expert||They know the data they want to use, have a general assessment of its readiness, and know who to talk to. They understand the metadata that needs to be present to make data usable and user-friendly.|
Data Portfolio Roles
|Data Suppliers||They provide the data and know their source.|
|Data Analyst||They provide the modeling analysis of the supply, covering the objects, fields, values, rules, and qualities needed.|
|Data Engineer||They provide the ELT data pipeline and maintain the pipeline and lake for the supplies. They also automate the supply of the data.|
|Data Architect||Designs the data system, outlines the dataflow and defines and designs how and where the different roles utilize the system.|
|Data Scientist||They know how to translate, via relational or AI/ML scripts, the reference, temporal, financial, and/or geospatial data needed to integrate data across supplies within the pipeline before it becomes available to the enterprise. They understand methodology and have a tool chest of AI resources that can convert data into knowledge.|
|Data Developer||They make the final transformations that produce the end product – a service, product, map, app, API, etc. They also automate the delivery of the data products.|
Data Technology Expert Roles
|Data Solution Architect||Supports the requirements definition and leads the solution design analysis and recommendation for various platforms, flows, technology stacks, and planning, from MVP through features through architecture qualities.|
|Platform / App Developer||Develops end-user-facing tools for interacting with the data platform and/or creates data services for accessing the data.|
|Data Wrangler||Handles the likely large volume of data quality issues, working for the data developer and with design roles to address issues with the design solution assumptions and the supplier. Usually requires strong transformation scripting skills.|
|Cloud DevOps||Initially, they design the cloud environment and deployment needed to facilitate the supply, developer, and user flows to meet service level expectations. Infrastructure as Code.|
|Data Administrator||They administer the data environment (lake, cube, fabric, warehouse, etc.), including performance and data lifecycle service management monitoring. Also known as a data platform or data system administrator.|
|Cloud DevOps Administrator||They provide the security, IT compliance, and service level agreements of the managed cloud service environment.|
So What Roles Help Out The Different Parts of Data Lifecycle Management The Most?
In order to answer that question, first we need to understand that there are typically three key phases of the data lifecycle:
- Planning – Planning focuses on Customer Relationship Management for the Needs and Requirements Lifecycle, Cost Benefits Analysis of data assets, Source Data Acquisition and Quality Planning and Funding Planning.
- Production and Data Lifecycle Management – Process evaluation and improvements, establishing improved automation in data pipelines (ETL, ELT), and maintaining the end-to-end stewardship of the data (define, inventory/evaluate, obtain, access, maintain, use/evaluate, and archive). First and foremost, this group extracts source data, loads it, and executes transforms to support delivery’s goals.
- Service Delivery – Create data derivatives for delivery (downloads, services, packages, composite products), architect and support service platform APIs, manage infrastructure to be FAIR (Findable, Accessible, Interoperable, Reusable), improve discovery, and encourage and enable community and collaboration.
To assign responsibility to each major role within these three parts of the data lifecycle, consider the concept of RASCI (Responsible, Accountable, Supports, Consults, Informs). A RASCI is a responsibility assignment matrix that clarifies the responsibilities of each role during preparation and implementation. The RASCI table below groups each role by responsibility: RA is primary and SCI is support-based. A cell filled with an R, an A, or both means the job falls into a leadership role, while a cell filled with an S, C, I, or any combination of them means the job is more of a knowledge/support role.
The graph below the table further categorizes roles by the part of the data lifecycle they fall under.
|Title||Planning||Production Operations||Delivery Services|
|Mission Portfolio Roles|
|Sponsor, Mission SME||RA|
|Data Mgmt Leadership Role||C|
|Data Catalog Expert||SCI|
|Data Portfolio Roles|
|Technology Portfolio Roles|
|Data Solution Architect||CI||RA|
|Cloud DevOps Administrator||S|
Note that the four roles in the “mission portfolio” section are involved in the planning stage. Some would argue that unless the planning roles are played properly, the entire operation falls apart. Next we have the “data portfolio roles,” which are spread far more widely across the lifecycle than the mission portfolio roles. However, a good portion of them are involved as S, C, I, or some combination of the three. Finally, we have the “technology portfolio roles,” which are found more in production operations and delivery services. A good portion of them are also involved as S, C, and I, much like the data portfolio roles.
If you follow the logic that a role matters more the more parts of the lifecycle it helps with (quantity), it would be easy to say that Data Architects are the most important role in the data lifecycle, because their responsibilities are spread across all three parts. Between this and the paragraph just before the graph, it is ultimately a matter of opinion, so look back at the first table and decide for yourself who helps out the most.
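The RA-versus-SCI grouping described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_cell` and the "mixed" and "none" labels are hypothetical and not part of any Xentity tooling, and the article does not say how a cell that combines leadership and support letters should be treated.

```python
# Hypothetical sketch: classify one RASCI table cell as leadership or support.
LEADERSHIP_LETTERS = {"R", "A"}    # Responsible, Accountable
SUPPORT_LETTERS = {"S", "C", "I"}  # Supports, Consults, Informs


def classify_cell(cell: str) -> str:
    """Return 'leadership', 'support', 'mixed', or 'none' for a RASCI cell."""
    letters = set(cell.strip().upper())
    if not letters:
        return "none"  # empty cell: the role is not involved in this phase
    unknown = letters - LEADERSHIP_LETTERS - SUPPORT_LETTERS
    if unknown:
        raise ValueError(f"not a RASCI letter: {sorted(unknown)}")
    if letters <= LEADERSHIP_LETTERS:
        return "leadership"  # an R, an A, or both
    if letters <= SUPPORT_LETTERS:
        return "support"     # any combination of S, C, and I
    return "mixed"           # assumption: the article does not cover this case


# Cell values that appear in the RASCI table above.
for cell in ["RA", "C", "SCI", "CI", "S"]:
    print(cell, "->", classify_cell(cell))
```

Keeping the two letter sets as constants makes the leadership/support rule easy to adjust if your organization draws the line differently.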
Okay, But What Roles Help Out the Most During The Project’s ‘Data Lifecycle’?
We now look at the question of who helps out the most in the data lifecycle from a completely new perspective. At Xentity, we closely follow the concept of Discovery-Define-Design-Develop-Implement-Maintain, so we will reexamine the roles with the same rules. In the table below, blue marks a leadership role and green marks a support role. In the section that follows, we examine another graph that asks whether you can say which role helps out the most from a quantitative perspective.
|Mission Portfolio Roles|
|Sponsor, Mission SME||Mission||Mission|
|Data Mgmt Leadership Role||x||x||Mission||Mission|
|Data Catalog Expert||x||x||x|
|Data Portfolio Roles|
|Technology Portfolio Roles|
|Data Solution Architect||x||Tech||Mission|
|Data Administrator||x||Data Tech|
|Cloud DevOps Administrator||x||x||IT Tech|
From a more quantitative perspective, the Data Management Leadership, Business Analyst, and Data Developer roles arguably help the data lifecycle the most, because each is involved in 4 of the 6 stages of the development/implementation process we follow so closely at Xentity. As in the previous graph, you can see that certain roles are far more involved throughout the lifecycle than others. Alternatively, if you follow the idea that success starts at the top, you could argue (based on the table’s data alone) that Data Management Leadership is the most helpful: it is involved at four points in the development process, and it holds a leadership role at two of them.
But, as with the previous graph, your mileage may vary. So, we encourage you to look at the first table once more. Using the graph and the description of each role, decide for yourself which role you think helps the data lifecycle the most.
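The quantitative comparison above amounts to counting how many of the six stages each role appears in. Here is a minimal Python sketch of that count. It assumes the only rows available are the ones that survive in the table above, and it stores each role's non-empty cells as a plain list, because the alignment of cells to stages cannot be recovered from the table as shown. The names `surviving_cells`, `involvement`, and `rank_roles` are hypothetical.

```python
# Hypothetical sketch: rank roles by how many lifecycle stages they touch.
STAGES = ["Discovery", "Define", "Design", "Develop", "Implement", "Maintain"]

# Non-empty cells per role, copied from the table above. Which stage each
# cell belongs to is not recoverable here, so only the count is used.
surviving_cells = {
    "Sponsor, Mission SME": ["Mission", "Mission"],
    "Data Mgmt Leadership Role": ["x", "x", "Mission", "Mission"],
    "Data Catalog Expert": ["x", "x", "x"],
    "Data Solution Architect": ["x", "Tech", "Mission"],
    "Data Administrator": ["x", "Data Tech"],
    "Cloud DevOps Administrator": ["x", "x", "IT Tech"],
}


def involvement(cells):
    """Count the stages a role is involved in (its non-empty cells)."""
    return sum(1 for cell in cells if cell.strip())


def rank_roles(table):
    """Return (role, count) pairs, most involved first, ties by name."""
    counts = [(role, involvement(cells)) for role, cells in table.items()]
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


for role, count in rank_roles(surviving_cells):
    print(f"{role}: {count}/{len(STAGES)} stages")
```

With these rows, the Data Mgmt Leadership Role comes out on top at 4 of 6 stages, which matches the reading given above. A count like this measures breadth of involvement only, not how much each contribution matters.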
Each Role Matters
A lot goes into supporting the data lifecycle, from beginning to end. Whether you look at the Mission Portfolio Roles (which determine what kind of data is needed and what questions need to be asked), the Data Portfolio Roles (which provide ETL pipelines and microservices), or the Data Technology Experts (who provide data platforms and design environments for data), you can make the argument that some roles help the data lifecycle more than others, especially if you count how often they are involved. But make no mistake: each role is useful in the lifecycle.
At Xentity we love data, and that love has led us to get involved at various points in the data lifecycle. Check out our services page to learn more about the services we offer for your data.