Reliable, Scalable and Maintainable Applications

Many applications these days are data-intensive as opposed to compute-intensive.
A data-intensive application is typically built from standard building blocks that provide commonly needed functionality, such as:
- Store data so that the same or another application can find it later (databases)
- Remember the result of an expensive operation to speed up reads (caches)
- Allow users to search data by keyword or filter it in various ways (search indexes)
- Send a message to another process, to be handled asynchronously (stream processing)
- Periodically crunch a large amount of accumulated data (batch processing)
If these building blocks sound obvious, that is because these data systems are such successful abstractions.
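As a small illustration of the cache building block, the sketch below memoizes an expensive operation in process memory using Python's standard `functools.lru_cache`. The function `expensive_report` and the counter `call_count` are hypothetical names invented for this example, and a real deployment would more likely use a dedicated cache service than an in-process one.

```python
from functools import lru_cache

call_count = 0  # counts how often the expensive function body actually runs


@lru_cache(maxsize=128)
def expensive_report(user_id: int) -> str:
    """Hypothetical stand-in for a slow query or computation."""
    global call_count
    call_count += 1
    return f"report-for-user-{user_id}"


first = expensive_report(42)   # computed, then stored in the cache
second = expensive_report(42)  # served from the cache; the body does not run again
```

The second call returns the remembered result, which is the whole point of a cache: pay for the expensive operation once and make later reads cheap.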

Thinking about Data Systems

We typically think of databases, queues and caches as different tools because of their different access patterns and use cases, but ultimately they all store data for some time.
With the arrival of tools like Apache Kafka (a message queue with database-like durability) and Redis (a datastore that is also used as a message queue), this boundary has blurred further.
Many applications these days have such a wide range of requirements that a single tool can't meet them all. Usually the work is broken down into tasks that can each be performed efficiently on a single data system, and these systems work in harmony with application code to process large volumes of data.
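To make the idea of application code coordinating several data systems concrete, here is a minimal sketch. Plain in-memory dictionaries stand in for a database, a cache and a search index; the class `CompositeDataSystem` and its methods are hypothetical and chosen only for illustration, not taken from any real library.

```python
from collections import defaultdict
from typing import Optional


class CompositeDataSystem:
    """Hypothetical facade: application code coordinating three stores."""

    def __init__(self) -> None:
        self.database = {}                    # system of record: doc id -> text
        self.cache = {}                       # fast copies of recently read docs
        self.search_index = defaultdict(set)  # word -> set of doc ids

    def write(self, doc_id: str, text: str) -> None:
        old_text = self.database.get(doc_id)
        if old_text is not None:
            # Drop index entries that belong to the previous version.
            for word in old_text.lower().split():
                self.search_index[word].discard(doc_id)
        self.database[doc_id] = text
        self.cache.pop(doc_id, None)  # invalidate so readers never see stale text
        for word in text.lower().split():
            self.search_index[word].add(doc_id)

    def read(self, doc_id: str) -> Optional[str]:
        if doc_id in self.cache:
            return self.cache[doc_id]
        text = self.database.get(doc_id)
        if text is not None:
            self.cache[doc_id] = text  # populate the cache on a miss
        return text

    def search(self, word: str) -> set:
        # Return a copy so callers cannot mutate the index.
        return set(self.search_index.get(word.lower(), set()))


system = CompositeDataSystem()
system.write("doc1", "Kafka is a durable log")
system.write("doc2", "Redis is a datastore")
result = system.read("doc1")
matches = system.search("is")
```

Callers see a single interface, while the application code behind it is responsible for keeping the cache and the index consistent with the database on every write.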
Designing any new data system raises several problems that must be solved:
- Ensuring data remains correct and complete, even when things go wrong internally
- Providing consistently good performance to clients, even when parts of the system are degraded
- Scaling to handle an increase in load
- Deciding what a good API for the service looks like
External factors that also affect the design process:
- Skills and experience of the people involved
- Legacy system dependencies and technical debt
- Timescale for delivery
- Organisation's tolerance for risk and regulatory constraints