How to Improve Your Monolith Before Transitioning to Microservices
- source: https://news.ycombinator.com/item?id=32000598 (July 06, 2022)
A lot of those steps just seem like good engineering. (I personally prefer modular monoliths over microservices though, in all but very few cases.)
Agreed. I always wonder why people think that their inability to write libraries with good modularization will be solved by introducing microservices. It takes experience and guts to know when to use what, and most people just go with the latest and fanciest. Well-tested, focused, and self-contained libraries are good architecture even when microservices are a must.
Does each microservice really need its own database? Yes
If they don’t have their own data stores, they’re not microservices.
The main premise is independent deployability. You need to be able to work on a microservice independently of the rest, deploy it independently, and it has to support partial rollouts (i.e. half of the replicas on version X and half on version Y), rollbacks including partial rollbacks, etc.
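A minimal sketch of what partial rollouts demand of your payloads; the field names and versions here are purely illustrative, not from the thread:

```python
import json

# During a partial rollout, version X and version Y replicas handle the same
# traffic, so every payload change has to be tolerated by both sides.
# Illustrative rule: readers ignore unknown fields and default missing ones.

def parse_order(raw: str) -> dict:
    """Parse an order payload tolerantly so v1 and v2 replicas interoperate."""
    data = json.loads(raw)
    return {
        "order_id": data["order_id"],              # required in every version
        "currency": data.get("currency", "USD"),   # added in v2; v1 senders omit it
        # any extra, newer fields in `data` are silently ignored
    }

# A v1 replica can still read a v2 payload, and vice versa:
print(parse_order('{"order_id": 42}'))
print(parse_order('{"order_id": 42, "currency": "EUR", "newer_field": 1}'))
```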
You could stretch it into some kind of quasi-isolation by giving each microservice its own schema within a single database, where each service is responsible for managing migrations of its schema and you enforce some kind of isolation policy. You pretty much wouldn’t be able to use anything from other schemas, as that would almost always violate those principles, making the whole thing just unnecessary complexity at best. Overall it would be a stretch, and a weird one.
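For concreteness, here is roughly what that schema-per-service setup could look like on Postgres. This is a sketch assuming psycopg2; the database, schema, and role names are made up, and error handling for re-runs is elided:

```python
import psycopg2

# One physical database, one schema per service, with a dedicated role that
# can only touch its own schema. All names here are illustrative.
conn = psycopg2.connect("dbname=shared_db user=admin")
conn.autocommit = True
cur = conn.cursor()

cur.execute("CREATE SCHEMA IF NOT EXISTS orders")
cur.execute("CREATE ROLE orders_svc LOGIN PASSWORD 'secret'")  # fails if re-run
cur.execute("GRANT USAGE, CREATE ON SCHEMA orders TO orders_svc")
# Make the service's own schema its default, and grant it nothing elsewhere.
cur.execute("ALTER ROLE orders_svc SET search_path = orders")
cur.execute("REVOKE ALL ON SCHEMA public FROM orders_svc")
```

The service then connects as orders_svc and runs its own migrations inside its schema; the role boundary is what stands in for physical isolation.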
Of course it implies that what used to be a simple few lines of SQL with transactional isolation and atomicity now becomes a PhD-level, complex, distributed problem: sagas, two-phase commits, do+undo actions, complex error handling because communication can break at arbitrary places, performance problems, ordering of events. You don’t have immediate consistency anymore, you have to switch to eventual consistency, very likely do some form of event sourcing, duplicate data in multiple places, think a lot about forward and backward compatibility (e.g. of event schemas), take care of APIs and their compatibility contracts, choose well between orchestration and choreography, etc.
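To make the do+undo point concrete, here is a bare-bones orchestrated saga sketch; the step names and actions are invented for illustration, and real sagas additionally need retries and persistence for the compensations themselves:

```python
# Orchestrated saga: each step has a "do" and a compensating "undo"; on
# failure, completed steps are undone in reverse order. This is the
# distributed stand-in for what a single transaction gave you for free.

class SagaStep:
    def __init__(self, name, do, undo):
        self.name, self.do, self.undo = name, do, undo

def run_saga(steps):
    done = []
    try:
        for step in steps:
            step.do()
            done.append(step)
    except Exception as exc:
        # Communication can break at arbitrary places, so compensate
        # everything that already committed, in reverse order.
        for step in reversed(done):
            step.undo()
        raise RuntimeError(f"saga failed at {steps[len(done)].name}") from exc

# Illustrative steps; in reality each would be a call to a different service.
run_saga([
    SagaStep("reserve_stock", lambda: print("stock reserved"),
             lambda: print("stock released")),
    SagaStep("charge_card", lambda: print("card charged"),
             lambda: print("card refunded")),
])
```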
You want to employ those kinds of techniques not for fun but because you simply have to, because you have no other choice, e.g. you have hundreds or thousands of developers, or run at the scale of hundreds or thousands of servers.
It’s also worth mentioning that you can have independent deployability with services/platforms as well: if they’re conceptually distinct and have a relatively small API surface, they are potentially extractable, you can form a dedicated team around them, etc.
Each service needs its own isolated place to store data. This programming- and integration-layer concern is very important. What’s less important is having those data stores physically isolated from each other, which is a performance and cost concern. If your database can isolate schemas/namespaces, then you can share the physical DB as long as each piece of data is only used by a single service. I’ve also seen a lot of microservices split along write-side/read-side concerns, often because of scaling: the read side and write side frequently have very different scaling needs. This couples the two services through data, but together they form the facade of a single-purpose service, like any other microservice, for outside parties.
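A toy sketch of that write-side/read-side split, using an in-memory list as a stand-in for a shared log or queue; all names are illustrative:

```python
# Toy write/read split: the write service appends events, the read service
# maintains a query-optimized projection of them. The two sides scale and
# deploy separately but present one logical "orders" service to outsiders.

events = []  # stand-in for a durable log/queue shared between the two sides

class OrderWriteService:
    def place_order(self, order_id, amount):
        events.append({"type": "order_placed", "id": order_id, "amount": amount})

class OrderReadService:
    def __init__(self):
        self.totals = {}  # projection shaped for reads, not for writes

    def consume(self):
        for e in events:
            if e["type"] == "order_placed":
                self.totals[e["id"]] = e["amount"]

    def get_order_total(self, order_id):
        return self.totals.get(order_id)  # eventually consistent

write, read = OrderWriteService(), OrderReadService()
write.place_order(1, 99.0)
read.consume()  # in production this is a subscription, so reads lag writes
print(read.get_order_total(1))
```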
Additionally, you can probably get by with feeding low-criticality reports through direct DB access as well. If you can afford to have them broken for a while after an update, it’s probably easier than running the queries through the API.
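For example, something like this (a sketch assuming psycopg2 and the illustrative orders schema from above; the table and columns are invented):

```python
import psycopg2

# A low-criticality report reading the service's tables directly. Cheap to
# write, but coupled to the schema: a migration can break it until someone
# updates the query, which is exactly the trade-off being accepted here.
def daily_revenue(dsn: str):
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute("""
            SELECT date_trunc('day', created_at) AS day, sum(amount)
            FROM orders.orders
            GROUP BY 1 ORDER BY 1
        """)
        return cur.fetchall()
```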
People forget that microservices commonly serve massive tech companies. With 100 developers working on the same product, you need it broken up and separated. If you’re in a small company, the value proposition is not as great. It’s contextual: a tech stack that solves organizational problems, not always technical ones.
At FB we introduced service boundaries for technical reasons, like needing a different SKU. Everything that could go into the giant, well-maintained repo/monolith did, because distributed systems problems start at “the hardest fucking thing ever” and go up from there.