Choosing a Modern Data Platform
Seeking a modern data platform to solve all your data needs? Start with the end in mind:
· What you are trying to accomplish
· Your end-user personas and their skills
· The platform interactions your users will have now versus in the future
How much do you really know at this point about what you want and need? Does your team have the skills needed to guide this selection, and an understanding of the points above?
How did we get here?
Companies often grow organically, whether through acquisitions or through collaboration across different business lines, departments, or geographies, in order to grow their client market share or to streamline information flow. It is no surprise that most of us have a number of data solutions, ranging from complex Excel files, databases, data lakes, operational data hubs and data warehouses to multiple data copies across various applications and products.
When growing organically, it is normal for data skills and approaches to vary. They can range from manually collecting data into spreadsheets, with the analysis done in the spreadsheet itself, to writing scripts that extract data and build a central integrated data mart with different reports on top of it.
Data and information can be addictive: the more you discover, the more gaps you find and the more questions you have to answer. If you are good at data, you are taking actions from these insights and thus generating even more data. It’s a virtuous cycle, and you will always have data gaps and quality issues as you go on this journey.
Whilst we focus a lot on all the things we need to do, a good data strategy is also laser-focused on what we don’t have to address right now, so that we do not over-think it. After all, in 5 years the technology will have evolved again.
Some basics we can agree on to get started
It is common for teams to start seeking out one product, or a range of products, whilst they are still trying to understand what they want to achieve. Whilst there is a lot to unpack here, there are some common basic needs that are non-negotiable when seeking a modern data platform:
1. Think Cloud First!
2. Lead with speed and timeliness:
Data in its raw form needs to be made available quickly and easily.
Ingestion patterns should support event-based triggers and real-time delivery, but be clear on the actual real-time standards your use cases require.
3. Demonstrate Flexibility:
The platform should allow users to upload their own data; don’t depend on central teams alone to bring data in.
Anticipate the need for operational and analytical layers and the required data standardizing/mastering/controls in your modern data platform, ideally without building too many data copies.
Provide unified API access for consumers to get to your data.
Support different languages and a rich set of functionalities that you can turn on and off.
4. Personalize the Experience:
Build your data platform as a welcoming product for different user personas:
With varying data skills.
Who are using the platform in different ways, e.g. for analytics, services or products.
Visualization and user experience are key.
Market your welcoming product:
Publish your data architectures and documentation as you go, and ensure this information is easy to retrieve and simple to understand.
Create the right support structure for questions (chat, email, online training, etc.).
Build that welcoming journey, from the introduction of the platform, through getting access, to users confidently running their first query, and ensure you monitor each of these steps closely.
Provide active metadata and a data catalogue to help users navigate your platform.
5. Build transparency around risks and manage them:
Consider data privacy, confidentiality, security and other regulations.
Address the need for quality and standardization without falling into the analysis-paralysis trap.
6. Effortless automated learning behind the scenes about your platform, data and users:
Data entered must be catalogued, with proactive monitoring of its usage.
Build the development and operations (DevOps) standards and culture.
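To make point 6 a little more concrete, here is a minimal sketch of what "catalogue on entry, monitor usage" could look like. It is an illustration only: the class names, fields, data set names and personas are all hypothetical, and a real platform would rely on its own catalogue and monitoring services rather than an in-memory structure like this one.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CatalogueEntry:
    """Hypothetical metadata captured when a data set enters the platform."""
    name: str
    owner: str
    source: str
    registered_at: datetime
    # Number of accesses per user persona, used for usage monitoring.
    access_counts: Counter = field(default_factory=Counter)


class DataCatalogue:
    """In-memory stand-in for a catalogue service (an assumption, not a real product)."""

    def __init__(self):
        self._entries = {}

    def register(self, name, owner, source):
        # Every data set is catalogued at the moment it is entered.
        if name in self._entries:
            raise ValueError(f"{name!r} is already catalogued")
        entry = CatalogueEntry(name, owner, source, datetime.now(timezone.utc))
        self._entries[name] = entry
        return entry

    def record_access(self, name, persona):
        # Called by the platform whenever a user reads the data set.
        self._entries[name].access_counts[persona] += 1

    def unused(self):
        # Data sets nobody has touched yet: candidates for follow-up.
        return sorted(n for n, e in self._entries.items() if not e.access_counts)

    def usage_by_persona(self):
        # Total accesses per persona across the whole platform.
        totals = Counter()
        for entry in self._entries.values():
            totals.update(entry.access_counts)
        return dict(totals)


catalogue = DataCatalogue()
catalogue.register("sales_orders", owner="finance", source="user_upload")
catalogue.register("web_events", owner="marketing", source="event_stream")
catalogue.record_access("sales_orders", persona="analyst")
catalogue.record_access("sales_orders", persona="analyst")
catalogue.record_access("sales_orders", persona="data_scientist")

print(catalogue.unused())
print(catalogue.usage_by_persona())
```

The point of the sketch is the design choice rather than the code itself: cataloguing happens as a side effect of data entering the platform, and usage is recorded per persona, so the platform team learns about its data and users without asking anyone to fill in a form.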
Are you ready to build all of these?
These are non-negotiable, but they are not really the core intellectual property (IP) that most businesses will differentiate themselves on (unless you are building a data platform to sell to others). For most companies, whilst getting the above right is fundamental, their real intellectual property comes from introducing new types of insight for their clients and customers within their industry verticals, either from the amalgamation of different data sets or from data acquired through their business operations. These types of high-value internal data are often buried in unstructured documents scattered throughout a company. These are real considerations when you decide which data platform is fit for your company.
Different products can give you greater flexibility and less lock-in, but they may be so open that the out-of-the-box solution will not give you the things described above unless you invest the time to build them. Note that technology is also changing rapidly, so speed to market on what differentiates you should be a key driver in making the decision.
Companies often have data teams in different parts of the organization. You really need a cohesive commercial architecture based on your needs and end-user personas. It is also worth considering the dynamics of your build teams when determining this solution, especially as you may have legacy solutions that may or may not be candidates for a modern data platform.
Teams often start by trying to migrate all their legacy solutions over, which takes a lot of bandwidth and time. Whilst cost savings are often attractive in the short term, the magnitude of this initiative, if successful, is far greater than a full migration followed by shutting down the older platforms. Be selective about what you migrate and focus on the strategic commercial things. If the platform is built as a welcoming product, others will migrate over themselves once the system becomes mature.
What’s next?
There are additional items to consider, which I will describe in my next article, but my guidance would be to ensure that you have the investment thesis locked down before you go much further. Kin + Carta, in their 2020 Digital Report, describe the need for organizations to think story vs storage, so it is important to understand the investment thesis I described in my previous article before going too far into other technical components and architecture.
If you are evaluating solutions that are out there, avoid the PowerPoint presentations and ask for a live demo of the platform with 3 different relevant data sources and personas. During the demo, walk through the standard process these personas will undertake and look at the monitoring that comes out of the box (or with minimal configuration). This will allow you to evaluate the capability against the points I have listed above. Importantly, this process will also start to highlight limitations within your organization in terms of people, process and legacy platforms that you will need to plan for.
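One simple way to capture what you see in those demos is a scorecard against the six basics listed earlier. The sketch below is only an assumption about how such a scorecard might work: the criterion names are my shorthand for the six points, the weights are set equal as a neutral starting point, and the platform names and 1 to 5 scores are made-up placeholders, not results from any real evaluation.

```python
# Shorthand names for the six basics; equal weights are a neutral default
# that each organization would adjust to its own investment thesis.
CRITERIA_WEIGHTS = {
    "cloud_first": 1.0,
    "speed_and_timeliness": 1.0,
    "flexibility": 1.0,
    "personalized_experience": 1.0,
    "risk_transparency": 1.0,
    "automated_learning": 1.0,
}


def weighted_score(scores, weights=CRITERIA_WEIGHTS):
    """Return the weighted average of 1-5 demo scores for one platform."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    for criterion in weights:
        if not 1 <= scores[criterion] <= 5:
            raise ValueError(f"{criterion} must be scored between 1 and 5")
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


def rank_platforms(demo_scores, weights=CRITERIA_WEIGHTS):
    """Rank platforms from highest to lowest weighted score."""
    ranked = [
        (name, round(weighted_score(scores, weights), 2))
        for name, scores in demo_scores.items()
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Placeholder scores for two hypothetical platforms after their live demos.
demo_scores = {
    "platform_a": {
        "cloud_first": 4, "speed_and_timeliness": 3, "flexibility": 5,
        "personalized_experience": 2, "risk_transparency": 4,
        "automated_learning": 3,
    },
    "platform_b": {
        "cloud_first": 5, "speed_and_timeliness": 4, "flexibility": 3,
        "personalized_experience": 4, "risk_transparency": 3,
        "automated_learning": 4,
    },
}

ranking = rank_platforms(demo_scores)
print(ranking)
```

Scoring each persona's walk-through separately and then comparing the rankings is one way to see where a platform serves some users well and others poorly.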
Call me if you would like to discuss further. In my next article, I will share some evaluation templates to help with your selection and some lessons learned from attempting this. I will also share some success metrics that you can build into your plan to accelerate your successful data journey!