This article shares the views of Parag Arora, a member of our community since December 2019. Parag is a serial tech entrepreneur and a YC and IIT alumnus who has scaled up many products and businesses over his 12-year career. In conversation with our expert engagement lead, Megha, he shares the engineering best practices that an ace programmer should follow.
Revision Control System
Parag believes that the most important factor in identifying good engineering is whether the codebase is tracked in a good revision control system such as Git or Subversion. He adds, “Any codebase in a single repo should be independently deployable. Where a system has multiple codebases, it is a distributed system in which every codebase represents an app or a service. There should be a single codebase for every app, and every deployment should run a single instance of that particular codebase or app.”
The next big idea Parag shared was that a dependency should not be part of the code itself but should exist as a separate entity or package outside it, which the code can then include. He says, “Many programming languages offer tools for this, e.g. Ruby’s Gemfile and Python’s pip, which are nothing but dependency management tools. So whenever you are using a language or a stack, make sure that the dependencies are maintained separately and don’t become part of your code. Any reusable module, e.g. email, should be a separate module or dependency that can be reused across different apps.”
Any variables or configuration values that differ between environments should be supplied as environment variables and not hard-coded. For example, production might point to a different database than development. Such settings should be defined only as system-level variables or constants and always kept in environment variables.
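As an illustration of this idea, here is a minimal Python sketch that reads environment-specific settings from environment variables. The variable names `DATABASE_URL` and `APP_DEBUG`, the `Settings` class and the `load_settings` function are assumptions made for this example, not names Parag mentioned.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    debug: bool


def load_settings(env=None):
    """Build settings from environment variables (names are hypothetical)."""
    env = os.environ if env is None else env
    # Fail fast if the required variable is missing, rather than falling
    # back to a hard-coded production or development value.
    database_url = env.get("DATABASE_URL")
    if not database_url:
        raise RuntimeError("DATABASE_URL is not set")
    debug = env.get("APP_DEBUG", "false").lower() in ("1", "true", "yes")
    return Settings(database_url=database_url, debug=debug)


if __name__ == "__main__":
    # Example: pass an explicit mapping instead of the real environment.
    demo = load_settings({"DATABASE_URL": "postgres://localhost/dev_db"})
    print(demo)
```

The same code can then run unchanged in every environment, with only the values of the variables differing between them.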
CI / CD Practices
Continuous Integration (CI) and Continuous Deployment (CD) should always be implemented as part of a build, release and run cycle. Parag adds, “Whenever a developer commits code, he or she should not be able to do so without automated tests in place. These are basically functions that automatically test the codebase. There should be automated testing to make sure that new changes do not break things that are already working. Deployment should also be automated, so that when your processes deploy a system, all of its components are tested automatically and it is deployed with the right environment configuration.”
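To make the idea of an automated test function concrete, here is a small sketch using Python's built-in `unittest` module. The function `apply_discount` and its rules are hypothetical and exist only to give the tests something to check. A CI job would typically run a suite like this on every commit and block the change if any test fails.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical function under test: reduce price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (100 - percent) / 100, 2)


class ApplyDiscountTest(unittest.TestCase):
    def test_regular_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_discount_keeps_price(self):
        self.assertEqual(apply_discount(99.99, 0), 99.99)

    def test_invalid_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(100.0, 120)


def run_tests():
    """Run the suite and report success, as a CI step would."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ApplyDiscountTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```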
Parag says, “All processes or applications should themselves be stateless.” He adds, “Suppose you are deploying an application on a server and the load is, let’s say, 10,000 concurrent connections. Now we decide that it has to scale to 50K. The app should be written in such a way that capacity grows, linearly or otherwise, as more instances are added. Scaling by adding instances is called horizontal scaling, and the hallmark of statelessness is that all processes or apps are horizontally scalable.”
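One way to picture statelessness is an app instance that keeps no user data on itself and reads and writes everything through a shared store. The sketch below is only illustrative: `InMemoryStore` is a stand-in for an external database or cache, and `CartService` is a hypothetical app, neither of which comes from the conversation.

```python
class InMemoryStore:
    """Stand-in for an external store such as a database or cache.

    In a real deployment this would live outside the app processes so
    that every instance sees the same data.
    """

    def __init__(self):
        self._data = {}

    def get(self, key, default=0):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


class CartService:
    """One app instance. It keeps no per-user data on itself."""

    def __init__(self, store):
        self._store = store

    def add_item(self, user_id, quantity):
        # All state is read from and written back to the shared store.
        key = f"cart:{user_id}"
        total = self._store.get(key) + quantity
        self._store.set(key, total)
        return total
```

Because two `CartService` objects that share one store behave as a single app, a request can land on any instance. A real shared store would also need an atomic update for the read-then-write step, which this sketch leaves out.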
Scaling should be as simple as running more processes, either inside one instance or across multiple instances. These processes should also be able to make full use of the machine’s resources, and they may even spin up multiple processes or threads inside the app or a single process. This lets the system scale in a way where you don’t have to write much code but only keep changing configuration.
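A rough sketch of scaling through configuration rather than code: the number of workers comes from an environment variable, and the job-handling code stays the same. The variable name `APP_WORKERS` and the functions below are assumptions for illustration. Threads are used here for brevity, and a process pool could be swapped in for CPU-heavy work.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def worker_count(env=None):
    """Read the number of workers from configuration (hypothetical name)."""
    env = os.environ if env is None else env
    return max(1, int(env.get("APP_WORKERS", "1")))


def handle(job):
    """Placeholder for real work, e.g. serving one request."""
    return job * job


def run_jobs(jobs, workers):
    # Same code for 1 or 50 workers; only the number passed in changes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle, jobs))
```

Calling `run_jobs(jobs, worker_count())` then picks up whatever worker count the deployment has configured.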
Similarity in Dev / Prod environment
Keep the development, staging and production environments as similar as possible so that when you deploy from staging to production, no untested edge cases are left over. Also, where the dev and prod environments do differ, always use the same third-party vendors in both the staging and production processes. However, he strongly feels that, as far as possible, one should try not to have differences between environments.
Log Management System
When asked about logs, he says, “Logs should always be treated as a stream of events. The application should keep pushing these events or event streams to a file or a separate system. It should not concern itself with the routing or storage of its output streams, but should be written to follow standard logging processes.” There should also be clarity in log characterisation: whether an entry is info, debug or a warning. The processes associated with each level should be: warning -> an alert in the logging system, info -> better traceability of an issue, and debug -> an action for the programmer. Basically, he feels that logs should be written so that each one has a call to action (CTA) assigned, along with the right timestamps, entities and structure, thus rendering them queryable. Small log volumes can be handled with an ELK stack, while large volumes call for big data processing.
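As a sketch of what structured, queryable logs could look like, the example below uses Python's standard `logging` module to write one JSON object per line to standard output, leaving routing and storage to whatever system collects that stream. The field names and the `entity` attribute are assumptions chosen for this example.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "timestamp": created.isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # "entity" is a hypothetical field passed through `extra=`.
            "entity": getattr(record, "entity", None),
        }
        return json.dumps(payload)


def build_logger(name="app", stream=None):
    """Create a logger that writes JSON lines to stdout by default."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.StreamHandler(sys.stdout if stream is None else stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger
```

A call such as `build_logger().warning("payment failed", extra={"entity": "order:42"})` then emits a line carrying the level, timestamp and entity, which a log pipeline can filter on.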
In the end, Parag adds, “All infrastructure, deployment or administrative processes should be treated as one-off processes. They should run from a single separate instance that has an identical environment. For example, running a migration, running a one-time script, or even logging into a machine from a console should all be done from a single environment only.”