Quick Steps for a Scrum Team to Improve the Process

Make your stand-ups short and concise, focus on value added

I copied this article by Egor Savochkin word for word; it is better written than anything I could have done. It hits all the high points from my own notes, especially with developers in different time zones.

https://medium.com/booking-com-development/quick-steps-for-a-scrum-team-to-improve-the-process-11c0c53b0adc


Scrum is a framework that gives teams the freedom to build their own processes. Unfortunately, teams often create complex, bloated processes that eat up all their time and leave little room for value-added or improvement work.

How often do your daily stand-ups stretch past 30 minutes? Last year, did you find time to work on process or technical improvements? As an Engineering Manager (EM), do you still find time to code or do code reviews? As a Product Manager (PM), do you always know what the team is up to?

If your stand-ups regularly overrun, or any of the other answers is “no”, it may mean that management and coordination are using up most of your time. Here are some simple steps to streamline your Scrum processes and make them lighter.

Step 1. Walk the board.
The Scrum Guide says the Daily Scrum’s purpose is to inspect progress toward the Sprint Goal and adapt the Sprint Backlog [SG2020]. Specific techniques aren’t mandated. But many teams default to the “three questions” format: What did I do yesterday? What will I do today? Are there any blockers? This approach is simple, but often inefficient for larger teams.

Here’s how the Daily Scrum (or stand-up) often goes.

  • 00:00 The team joins the Zoom call.
  • 00:01 “Let’s wait a couple of minutes for everyone to join.”
  • 00:03 Dev 1: “Okay, I’ll start. Yesterday, I worked on ticket XYZ-123. I implemented the backend logic for the new feature, added unit tests, and started working on integration tests. I ran into an issue with the API returning unexpected results, so I had to debug that for about an hour. Oh, and I attended two meetings. One with the designer about the new UX mockups. Another with a developer from another team to clarify requirements for the API integration. Btw, I am done with XYZ-123. So let me close it. Oh, I need to fill in the mandatory fields. Wait a bit… (Pause). Okay, done. Today, I plan to continue working on XYZ-456. I plan to complete the integration tests and attend the retrospective later this afternoon. I’ll also be looking into a bug that QA raised yesterday evening…”
  • 00:08 Dev 2: “Thanks. Yesterday, I mostly worked on the frontend for ticket ABC-123. I got the UI components aligned with the design specs and added validation for the input fields. I also reviewed a pull request from Developer 3 — good work, by the way! I spent some time investigating a flaky test in our CI pipeline, but I couldn’t figure out the root cause yet. Today, I plan to continue with ABC-123, wrap up the CI investigation, and attend the sprint planning session…”
  • 00:10 Dev 3: “Alright, my turn. Yesterday, I worked on ticket DEF-789. It was a bit tricky because there were some edge cases with the database queries, but I managed to optimise them. Now, I am stuck with the APIs. XXX, you worked on something similar recently — do you know how to fix this? (They are discussing the problem for 5 minutes). I also helped Developer 2 with a review for ticket XYZ-123 and attended the same design meeting. Oh, I spent about an hour updating our deployment docs. The old version was outdated. Today, I’ll start working on the new logging feature and address some of the comments on my open pull requests…”
  • 00:17 Dev 4: “Oops, looks like we’re running out of time. I’ll keep it quick. Yesterday, I…”

Well… a lot of details. The team is well out of time, but they still have not discussed all the tickets in progress. It is very difficult to digest all this information and come away with clear actions to push the work forward.

Sharing detailed updates is understandable — we’re proud of our work, and stopping can be hard. But without firm facilitation, these sessions balloon to 30 minutes or more, even for small teams.

Instead, skip the individual updates. Walk the board. Focus on the work items — not the war stories.

First, encourage team members to update ticket statuses before the stand-up to save time. If you have completed a ticket, move it to the Done column; completed tickets do not even need to be discussed during the stand-up. If a ticket is blocked, mark it with a flag (more on this later).

During the stand-up, walk the board backwards, from the tickets closest to completion to those least worked on. Why? Tickets closest to completion represent the most investment. Finishing them frees up mental capacity and resources for other tasks. Even if you run out of time, you will still touch the most important tickets.

For each ticket, the least the team needs to know is whether there are any blockers or issues. If there are, the team can stay on the call after the daily to address them or schedule a follow-up. Often, five minutes of team brainstorming can resolve issues that might otherwise cause hours of delay.

The level of detail can vary based on team size. For smaller teams of 4–5 members, it’s manageable to touch on every ticket and discuss any issues. For larger teams, the focus should shift to reviewing only tickets that are blocked or taking longer than expected.
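
The ordering described above can be sketched in a few lines of Python. This is only an illustration: the column names, the Ticket class, and the rule that flagged tickets come first within a column are assumptions, not part of any Jira API.

```python
from dataclasses import dataclass

# Hypothetical column order, left to right; adjust to match your own board.
COLUMNS = ["To Do", "In Progress", "In Review"]


@dataclass
class Ticket:
    key: str
    column: str
    flagged: bool = False  # True when the ticket is marked as blocked


def walk_order(tickets: list[Ticket]) -> list[Ticket]:
    """Return tickets from the right-most column to the left-most.

    Within a column, flagged (blocked) tickets come first so they are
    discussed even if the stand-up runs out of time.
    """
    return sorted(
        tickets,
        key=lambda t: (-COLUMNS.index(t.column), not t.flagged, t.key),
    )


board = [
    Ticket("JKL-202", "To Do"),
    Ticket("XYZ-456", "In Progress"),
    Ticket("GHI-101", "In Progress", flagged=True),
    Ticket("ABC-123", "In Review"),
]

for ticket in walk_order(board):
    marker = " (blocked)" if ticket.flagged else ""
    print(f"{ticket.column}: {ticket.key}{marker}")
```

Done tickets are left out on purpose, since they do not need to be discussed.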

Let the developers drive the stand-up. You do not need a PM or EM to do this. We have a rotating support schedule in the team. And the developer responsible for the support also runs the stand-up. This makes the team more self-sufficient and offloads some burden from the EM.

Here is an example of a Walk-the-Board stand-up.

  • 00:00 The team joins the Zoom call.
  • 00:00 Facilitator: “Good morning, everyone! Let’s walk the board. Remember to keep updates concise and focus on blockers or anything that needs team input. XXX hasn’t joined yet, but let’s begin — I hope he’ll join soon.”
  • 00:01 Facilitator: “I see that someone moved Ticket XYZ-123 to the Done column. Great!”
  • 00:01 Facilitator: “Let’s start with the In Progress column.”
  • 00:02 Facilitator: “Ticket XYZ-456” Developer 1: “I’m making good progress. No issues. No blockers.”
  • 00:03 Facilitator: “Ticket DEF-789” Developer 3: “I’ve started this bug, but I’m stuck on the API integration. XXX, you worked on something similar recently — can we discuss it after the call?” Developer 1: “Sure, I’ll help you after the stand-up.”
  • 00:04 Facilitator: “Ticket ABC-123” Developer 2: “I’m making good progress. No issues. No blockers.”
  • 00:05 Facilitator: “Ticket GHI-101: Bug fix (flagged as blocked).” Dev 4: “This is still blocked because I’m waiting for the infrastructure team to answer my question. I pinged them yesterday, and I’ll follow up again this morning.” Facilitator: “If you don’t hear back by noon, let’s escalate it.”
  • 00:05 Facilitator: “To Do Column: Ticket JKL-202. This ticket is next up. Any concerns before someone picks it up?” Dev 2: “None from me. I’ll take this after finishing ABC-123.” Facilitator: “Sounds good.”
  • 00:06 Wrap-Up Facilitator: “That’s everything on the board. Let’s stay on for a few minutes to discuss the API issue. Anything else?” Team: Silence. Facilitator: “Alright, let’s get to work!”

Effective stand-ups need discipline. While team members may feel proud of their work or want to share more details, they should focus on concise, relevant updates. With practice, the team can strike the right balance and keep stand-ups efficient.

Step 2. Fight the blockers.
Remember when the highway is least efficient? That’s right — during a traffic jam. The throughput is almost zero, and the highway barely delivers any value. Even early in the morning, when a few cars are whistling down the road, the throughput might be higher (see picture below).

Pic 1. The graph shows the throughput (flow) of a highway during a typical American morning. The throughput increases steadily from 6:50 a.m., peaking at approximately 7:15 a.m., after which it sharply declines as a traffic jam forms. The minimum throughput is observed between 7:20 and 7:40 a.m. (source)

One of the main focuses of traffic control officers is to improve the flow. Any jam should be fixed as quickly as possible; it makes no sense to keep them.

Every ticket that is not actively worked on is like a car in a traffic jam.

Make it a priority for the whole team to resolve blockers. First, ensure it is crystal clear when a ticket is blocked. In JIRA, the recommended way to do this is by marking it with a flag.

Creating a special “Blocked” column on the board is an option, but it’s not the best choice. Using a flag is simpler, more flexible, and visually clearer. It lets you mark tickets as blocked without needing to change their status, no matter where they are — backlog, to-do, or in progress.

Pic 2. The flag indicates which tickets are not being worked on. If you click on the ticket, you can see the comment with the reason why someone has blocked it.
The flag means that no one can work on the ticket due to various reasons, such as:

  • Waiting for a response from another team.
  • Blocked by a more important ticket.
  • Blocked by the monolith rollout.
  • A team member is ill.
  • Blocked by the deployment freeze.

Providing comments on blocked tickets is crucial. The reason for blocking should be clear.

The team’s board should reflect the real-time status of work items. Encourage team members to keep the status of the ticket updated. If you cannot work on a ticket any more, block it. If the ticket gets unblocked, remove the flag. Having such an agreement saves time on status updates and allows the whole team to see problems and act on them early.

During dailies, challenge the blockers and think of creative ways to unblock a ticket.

Watch out for the hidden blockers. If a ticket is taking too long to complete, it’s a good candidate for discussion in the daily stand-up or a quick brainstorming session. In JIRA, you can enable the “days in column” option and see the tickets taking longer than expected.
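
If your tracker cannot show days in column, the same check is easy to approximate from exported data. The sketch below assumes you can get the date each ticket entered its current column; the three-day threshold and the data layout are made up for illustration.

```python
from datetime import date

# Hypothetical threshold: tickets sitting longer than this deserve a mention.
MAX_DAYS_IN_COLUMN = 3


def stale_tickets(entered_column: dict[str, date], today: date) -> list[str]:
    """Return ticket keys that have stayed in their current column too long."""
    return sorted(
        key
        for key, entered in entered_column.items()
        if (today - entered).days > MAX_DAYS_IN_COLUMN
    )


entered = {
    "XYZ-456": date(2025, 3, 3),
    "DEF-789": date(2025, 3, 7),
}
print(stale_tickets(entered, today=date(2025, 3, 10)))
```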

Track the statistics of blockers and discuss them during retrospectives. This can bring useful ideas for future improvements.
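
A simple tally is often enough to start that retrospective conversation. The sketch below assumes a hand-made export of blocked tickets with the reason taken from the flag comment; the ticket MNO-303 and the day counts are invented for the example.

```python
from collections import Counter

# Hypothetical export: (ticket key, reason from the flag comment, days blocked).
blocked_log = [
    ("GHI-101", "Waiting for another team", 2),
    ("DEF-789", "Deployment freeze", 1),
    ("MNO-303", "Waiting for another team", 4),
]

count_by_reason = Counter(reason for _, reason, _ in blocked_log)
days_by_reason = Counter()
for _, reason, days in blocked_log:
    days_by_reason[reason] += days

# Most frequent reasons first.
for reason, count in count_by_reason.most_common():
    print(f"{reason}: {count} tickets, {days_by_reason[reason]} days blocked")
```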

Step 3. Focus on value versus activities.
We’re all here for one reason: to deliver value to our customers and stakeholders. In Scrum, that value is defined at the Product Backlog Item (PBI) level. Every sprint, we select a few PBIs to be part of the Sprint Backlog, along with an actionable plan for delivering the Increment [ScrumGuide2020].

But Scrum teams often mix value-added tickets with activity tickets. A common pattern is to leave small PBIs as they are but break bigger ones down into implementation tasks. On top of this come time-boxed tickets for support or research activities. Even with a small team of 3–4 developers, this soon becomes an overcomplicated mess. And when the team grows? Well, it turns into a management nightmare where everyone spends most of their time trying to figure out what is going on.

So, what’s the fix? Track only value-added tickets. Avoid activity tickets as much as possible. It makes the scope clearer, more transparent, and gives you metrics that actually mean something.

First, break down value-added tickets as much as possible by acceptance criteria. If a ticket takes several days to complete, don’t overcomplicate it by breaking it into even smaller activity tasks.

Second, ditch the time-boxed tickets. You shouldn’t have a PBI that doesn’t fit the sprint, right? Unfortunately, this happens all the time, and teams end up cloning the same ticket over and over. This messes with your stats and makes it hard to track progress. The solution again? Break the ticket down by acceptance criteria.

Pic 3. If an epic or story doesn’t fit in one sprint, the team should split it. Time-boxed tickets (on the left) don’t give you clear progress. Instead, split the PBI by acceptance criteria (like the example on the right). This makes everything much clearer.
Avoid tickets like “Support activities for week ##.” These tickets don’t add much value and clutter the board. Suppose you have five incidents or defects from support that week. It’s better to list each one as a separate ticket rather than as comments under a vague support activity.

Finally, steer clear of Increment-related activities like merging, building, integration testing, and deployments. Automate such tasks by making them part of the CI/CD pipeline. This way you will not need to create separate “activity” tickets.

A good definition of done (DoD) should assume that any change is ready for production. This aligns well with common sense and makes it easier for everyone to understand the status.

There are still cases when you might need activity tickets. For example, you might need a ticket for an end-to-end test spanning several teams. But this should be a rare exception.

Let’s look at an example. A team has been working for a month. Now, we need to review the completed tickets to prepare for a stakeholder meeting. The team developed a new service to enable partners to make payments online via a third-party payment service provider (PSP). They also worked on a few technical debt tickets and fixed two bugs from support.

We can list the results with focus on activities.

  • Task: Support week #27
  • Task: Support week #28
  • Task: Support week #29
  • Task: Support week #30
  • Task: Implement BE endpoint
  • Task: Integrate with PSP
  • Task: PSP payment page
  • Task: PSP payment testing
  • Task: Make the endpoint production ready [timeboxed] [part 1]
  • Task: Make the endpoint production ready [timeboxed] [part 2]
  • Task: Make the endpoint production ready [timeboxed] [part 3]
  • Bug: Partner ## cannot open page x

The list of tickets tells us what the team worked on, but it doesn’t highlight the value added.

We can see the team spent four weeks on support activities. The second bug isn’t listed, so it was likely added as a comment on a support activity ticket.

The team worked on the backend endpoint, the PSP page, the PSP integration, and testing. But it’s unclear what capabilities were implemented or if the team shipped them to the customer.

The team also spent three weeks making the endpoint production-ready. Again, it’s unclear what specific features the team implemented.

We can list the results with focus on value added.

  • Bug: Partner ## cannot open page x
  • Bug: Partner ## unable to proceed with the payment error
  • Story: Partners pay single invoice via credit cards in the Netherlands
  • Story: Partners pay single invoice via WeChat in China
  • Task: Payment results monitoring internal dashboard
  • Task: Log payment errors

Customer service can see the team has fixed two reported bugs. Stakeholders know that partners can now pay single invoices via credit cards in the Netherlands and WeChat in China. Operations can see that the monitoring dashboard and logs are now ready for use.

The work is the same but the representation in the latter case is much clearer.

Conclusion
Improving your Scrum process doesn’t have to be complicated. Small, simple changes can make a big difference. Try walking the board during standups, focusing on clearing blockers, and keeping tickets centered on value. Scrum works best when it’s clear and efficient — not overloaded with complexity. Stay focused on delivering value, work together as a team, and keep improving step by step. Start small, adapt as you go, and see the positive results for yourself.

OSX Tree

The tree command shows the folder structure of the current directory. Use the -d option to only show directories. The -I option lets you exclude certain directories. For example:

tree -I node_modules

To exclude multiple directories, separate their names with |:

tree -I 'node_modules|cache|notes|test*'

This will hide node_modules, cache, notes, and any directories starting with test from the output.

tree is installed with brew:

brew install tree
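
If tree is not available, a rough stand-in for tree -d -I can be written with the Python standard library. This is a sketch, not a reimplementation of tree: the function name list_dirs is made up, and the pattern matching uses Python's fnmatch rules, which may differ from tree's in edge cases.

```python
import fnmatch
import os


def list_dirs(root: str, exclude: list[str]) -> list[str]:
    """Return an indented listing of directories under root.

    Directory names matching any shell-style pattern in exclude are skipped,
    together with everything below them.
    """
    lines = []
    for current, dirs, _files in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirs[:] = sorted(
            d for d in dirs if not any(fnmatch.fnmatch(d, p) for p in exclude)
        )
        relative = os.path.relpath(current, root)
        depth = 0 if relative == "." else relative.count(os.sep) + 1
        lines.append("  " * depth + os.path.basename(os.path.abspath(current)))
    return lines


print("\n".join(list_dirs(".", ["node_modules", "cache", "notes", "test*"])))
```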

Web hygiene

In the fast-evolving world of technology, managing web content effectively is often a complex, multifaceted task. As a tech lead for a large international technology company, I’ve come to recognize the importance of not just keeping up with technological advancements, but also maintaining a standard I like to call “web hygiene.” While not an official industry term, web hygiene captures an essential, holistic approach to how we create, maintain, and protect our web presence.


What is Web Hygiene?

Web hygiene is a practice that embodies the principles and actions necessary to maintain a healthy, compliant, and user-centric online environment. It’s not just about flashy features or impressive performance metrics—it’s about ensuring that every part of our web content adheres to certain standards that safeguard users, support accessibility, and uphold our company’s reputation.

The term itself, “hygiene,” implies cleanliness and routine care—much like personal hygiene is a baseline for health and social interactions, web hygiene is the baseline for maintaining a trustworthy, robust, and inclusive web presence.

The Core Pillars of Web Hygiene

Web hygiene is not just one aspect of web management; it’s a cohesive practice that spans several critical areas:

1. Accessibility

An inclusive web experience is not just a compliance checkbox; it’s a commitment to enabling all users to engage with our content. Meeting WCAG standards ensures that people with disabilities can navigate, read, and interact with our sites seamlessly. Proper web hygiene integrates accessibility into every stage of design and development, making it part of the DNA of our content creation.

2. Data Collection and PII (Personally Identifiable Information)

Data is a powerful asset, but with great power comes great responsibility. Web hygiene means that any data we collect adheres strictly to privacy laws and ethical standards. This involves implementing transparent consent mechanisms, anonymizing data wherever possible, and maintaining a secure infrastructure to protect against breaches. The trust we build through responsible data collection cannot be overstated; it’s what sets leading technology companies apart from those who cut corners.

3. Maintenance and Upkeep

Web hygiene demands regular maintenance. This includes routine security audits, updates, and testing to ensure that we’re proactively closing vulnerabilities and optimizing performance. The idea here is to prevent problems before they arise—keeping the web infrastructure as healthy as possible to avoid breakdowns that could compromise user experience and trust.

4. Content Quality

Even the most secure, accessible, and well-maintained site can fall flat if the content is subpar. Good web hygiene includes having clear guidelines for writing effective, relevant, and engaging content. This means avoiding jargon, staying concise, and keeping the end user in mind at all times. Content should be easy to read, informative, and updated as needed to reflect current information and practices.

Training and Access Control

To uphold web hygiene, it’s crucial that all web content creators and site owners undergo an access course or training before being given the keys to their own subdomain or access to an existing one. This training ensures they understand the fundamentals of web hygiene, including accessibility, data handling, and content standards. This process will be tracked, and refresher sessions will be offered to maintain their access over time, reinforcing a culture of continuous learning and adherence to best practices.

Automated Tools and Monitoring

Web hygiene extends beyond human practices. Automated tools can be employed to scan websites and provide feedback on compliance with accessibility, security, and content quality. These tools serve as a proactive measure, highlighting potential issues before they escalate and ensuring that web hygiene standards are consistently met.
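
As a small illustration of what such a check can look like, the sketch below flags images that have no alt attribute, using only the Python standard library. It is a toy example under stated assumptions: the class name and the sample page are invented, and a real accessibility scanner covers far more than this one rule.

```python
from html.parser import HTMLParser


class MissingAltChecker(HTMLParser):
    """Collect <img> tags that have no alt attribute at all."""

    def __init__(self) -> None:
        super().__init__()
        self.missing: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # An empty alt="" is allowed for decorative images, so only a
        # completely absent attribute is reported.
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src") or "(no src)")


page = '<p>Hello</p><img src="logo.png"><img src="team.jpg" alt="The team">'
checker = MissingAltChecker()
checker.feed(page)
print(checker.missing)
```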

Why Web Hygiene Matters

You might wonder, why not just let creators and site owners do what they please? Isn’t creativity key? While creative freedom is important, in a large-scale organization, it must coexist with responsible practices. Poor web hygiene leads to a host of issues—ranging from accessibility complaints and data privacy violations to decreased trust and engagement. In an environment where a single oversight can cascade into serious reputational damage, maintaining web hygiene isn’t just good practice—it’s a business imperative.

Taking a Holistic Approach

Web hygiene isn’t a one-time checklist or a static policy; it’s a dynamic, ongoing process that involves cross-functional teams, including developers, content strategists, legal advisors, and security experts. It requires a shared understanding and commitment across the organization to do what’s right—even when no one is watching.

It’s about creating a culture where web hygiene is as natural as locking the doors when you leave a building. When every team member understands their role in maintaining these standards, we foster a healthier web presence that’s not just compliant, but also resilient and respected.

In the end, web hygiene represents a commitment to quality, integrity, and inclusivity that benefits both the company and its users. It’s about setting the bar higher and making sure that we’re doing more than the bare minimum—we’re upholding the values that make our digital spaces trustworthy and sustainable for all.

I’m learning Python!

Time to learn a new skill, here is my cheat sheet

JavaScript/TypeScript vs Python Crib Sheet

1. Variables and Data Types

| Concept | JavaScript / TypeScript Example | Python Example |
| --- | --- | --- |
| Declare variable (let, const) | `let x = 5; const y = 10;` | `x = 5` |
| Declare object | `let obj = { name: "John" }` | `obj = { "name": "John" }` |
| Declare array | `let arr = [1, 2, 3];` | `arr = [1, 2, 3]` |
| Array of objects | `let arr = [{name: "A"}, {name: "B"}];` | `arr = [{"name": "A"}, {"name": "B"}]` |
| Null | `null` | `None` |
| Undefined | `undefined` | `None` (used in place of undefined) |
| True/False | `true`, `false` | `True`, `False` |

2. Objects vs Dictionaries

| Concept | JavaScript / TypeScript | Python |
| --- | --- | --- |
| Create object | `let obj = { key: "value" }` | `obj = { "key": "value" }` |
| Access object value | `obj.key` or `obj["key"]` | `obj["key"]` |
| Add/update key | `obj.newKey = "newValue"` | `obj["newKey"] = "newValue"` |
| Delete key | `delete obj.key` | `del obj["key"]` |

3. Arrays vs Lists

| Concept | JavaScript / TypeScript | Python |
| --- | --- | --- |
| Create array/list | `let arr = [1, 2, 3];` | `arr = [1, 2, 3]` |
| Access element | `arr[0]` | `arr[0]` |
| Add element | `arr.push(4)` | `arr.append(4)` |
| Remove element | `arr.pop()` | `arr.pop()` |
| Array length | `arr.length` | `len(arr)` |
| Loop through array | `arr.forEach(item => console.log(item))` | `for item in arr: print(item)` |

4. Spread Operator / Unpacking

| Concept | JavaScript / TypeScript | Python |
| --- | --- | --- |
| Spread array | `let newArr = [...arr, 4];` | `new_arr = [*arr, 4]` |
| Spread object | `let newObj = {...obj, newKey: "new"}` | `new_obj = {**obj, "newKey": "new"}` |
| Spread function args | `function(...args) {}` | `def function(*args):` |
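
A short runnable version of the unpacking rows above, for trying in a Python shell. The variable names and the total function are only for illustration.

```python
arr = [1, 2, 3]
obj = {"name": "John"}

new_arr = [*arr, 4]                 # like [...arr, 4]
new_obj = {**obj, "newKey": "new"}  # like {...obj, newKey: "new"}


def total(*args):
    """Collect positional arguments into a tuple, like (...args) in JavaScript."""
    return sum(args)


print(new_arr)
print(new_obj)
print(total(*arr))  # * also unpacks a list into positional arguments
print(arr)          # the original list is unchanged
```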

5. Conditionals and Null/True Operations

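| Concept | JavaScript / TypeScript | Python |
| --- | --- | --- |
| If statement | `if (condition) {}` | `if condition:` |
| Ternary | `let val = condition ? trueVal : falseVal` | `val = trueVal if condition else falseVal` |
| Nullish coalescing operator | `let val = x ?? "default";` | `val = x if x is not None else "default"` (`x or "default"` also replaces 0, "" and other falsy values, like `\|\|` in JavaScript) |
| Optional chaining | `let val = obj?.key;` | `val = obj.get("key") if obj is not None else None` |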
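
Python has no direct `??` or `?.` operator, and the common shortcut `x or "default"` behaves like JavaScript's `||` rather than `??`. The sketch below shows the difference; the helper name coalesce is made up for the example.

```python
def coalesce(value, default):
    """Return default only when value is None, like `value ?? default`."""
    return value if value is not None else default


count = 0
print(count or 10)          # 10, because 0 is falsy (behaves like ||)
print(coalesce(count, 10))  # 0, because only None triggers the default

user = {"name": "John"}
missing = None

# Rough stand-ins for obj?.key when obj is a dict or None.
print(user.get("email"))                                     # None: key absent
print(missing.get("name") if missing is not None else None)  # None: obj absent
```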

6. Loops

| Concept | JavaScript / TypeScript | Python |
| --- | --- | --- |
| For loop | `for (let i = 0; i < arr.length; i++) {}` | `for i in range(len(arr)):` |
| For…of loop (array) | `for (let item of arr) {}` | `for item in arr:` |
| For…in loop (object keys) | `for (let key in obj) {}` | `for key in obj:` |
| While loop | `while (condition) {}` | `while condition:` |

7. Functions

| Concept | JavaScript / TypeScript | Python |
| --- | --- | --- |
| Declare function | `function myFunc() {}` | `def my_func():` |
| Arrow function | `const myFunc = () => {}` | `my_func = lambda: value` (single expression only; otherwise use `def my_func():`) |
| Return | `return value;` | `return value` |
| Default parameter | `function myFunc(a = 10) {}` | `def my_func(a=10):` |
| Rest parameters | `function(...args) {}` | `def function(*args):` |

8. Classes and Objects

| Concept | JavaScript / TypeScript | Python |
| --- | --- | --- |
| Declare class | `class MyClass {}` | `class MyClass:` |
| Constructor | `constructor()` | `def __init__(self):` |
| Instance method | `this.myMethod()` | `self.my_method()` |
| Create object | `let obj = new MyClass();` | `obj = MyClass()` |
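
Put together, the class rows above look like this in a complete example. The ClickCounter class is invented for illustration.

```python
class ClickCounter:
    def __init__(self, start=0):  # constructor, like constructor(start = 0)
        self.value = start        # self plays the role of this

    def increment(self):          # self must be listed explicitly
        self.value += 1
        return self.value


counter = ClickCounter(5)  # no new keyword
print(counter.increment())
```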

9. Error Handling

| Concept | JavaScript / TypeScript | Python |
| --- | --- | --- |
| Try/Catch | `try { ... } catch (e) { ... }` | `try: ... except Exception as e:` |

Key Differences to Keep in Mind:

  1. None vs. null/undefined: In Python, None represents both null and undefined in JavaScript. There’s no separate undefined in Python.
  2. Indentation matters: Python uses indentation to define blocks of code (no curly braces).
  3. No semi-colons: Python doesn’t require semi-colons at the end of statements.
  4. True/False: Use True and False in Python (capitalized), not true and false as in JavaScript.
  5. Method definition: When defining an instance method on a class, you need to include self as the first argument in Python, which refers to the instance of the class.

Enhancing Focus with Thematic Sprints in Our Dynamic Development Team


In the fast-paced world of software development, maintaining focus is crucial, especially for our dynamic team, which consists of a core of four developers but can expand to about a dozen. We’ve found that implementing thematic sprints, where each sprint is named after a specific theme or goal, significantly enhances our productivity and engagement.

When our team is small, usually capped at six members, we experience a focused output with active involvement from developers in the planning phase. However, we’ve noticed that stories can grow in complexity, leading to a loss of focus and extended timelines. To address this, we started naming our sprints with clear themes, creating a shared understanding of our objectives.

This simple practice reinforces our commitment during daily stand-up meetings, where we consistently reference the sprint theme. By framing our work around these themes, developers stay aligned with our goals and feel a greater sense of ownership.

Moreover, thematic sprints foster collaboration. As we engage with a specific focus, discussions naturally emerge, leading to innovative solutions and richer features. For instance, dedicating a sprint to user experience made our team more attentive to feedback, resulting in improved designs.

In conclusion, thematic sprints have transformed our development process, enhancing focus and creativity while nurturing a culture of collaboration and innovation. This approach has proven invaluable in our journey as a dynamic software development team.

NPM Dependency Notation – idiot’s guide


Dependency notation in the package.json file influences how npm handles version installations, and this affects whether npm install may update the package-lock.json file.

Here’s how it works based on different notations:

Version Notations and Their Meanings:

1. Exact Version (“4.19.2”):

  • Notation: “4.19.2”
  • Meaning: Install exactly version 4.19.2 of the package.
  • Effect on npm install:
    • When this version is specified, npm will always install version 4.19.2, regardless of whether newer minor or patch versions are available. The package-lock.json entry for this package will not change unless you manually change the version in package.json.
  • Example: If version 4.19.3 is available, npm will not install it.


2. Caret (^) Notation (“^4.19.2”):

  • Notation: “^4.19.2”
  • Meaning: Install any compatible version according to semver rules, meaning any version >=4.19.2 and <5.0.0.
  • Effect on npm install:
    • With ^, npm allows updates to the minor and patch versions, but not the major version. When npm has to resolve the range afresh (for example, there is no package-lock.json yet, or you run npm update), it typically picks the newest matching version, such as 4.19.3 or 4.20.0, and records that exact version in package-lock.json. If the lock file already pins a version inside the range, npm install keeps the pinned version.
  • Example: If 4.20.1 is the newest 4.x release and nothing is locked yet, npm install installs 4.20.1 and writes it to package-lock.json.


3. Tilde (~) Notation (“~4.19.2”):

  • Notation: “~4.19.2”
  • Meaning: Install the most recent patch version that matches the specified minor version, meaning any version >=4.19.2 and <4.20.0.
  • Effect on npm install:
    • With ~, npm will allow updates to the patch version but not the minor version. When npm resolves the range afresh and a newer patch version is available (e.g., 4.19.3), npm will install it and record it in package-lock.json. A version already pinned in the lock file is kept as long as it is inside the range.
  • Example: If version 4.19.5 is available and nothing is locked yet, npm will install that, but it will not install version 4.20.0.


4. Major.Minor Version (“4.19”):

  • Notation: “4.19”
  • Meaning: This implies “4.19.x”, meaning any version >=4.19.0 and <4.20.0, so npm can install the latest available patch version within the 4.19.x range.
  • Effect on npm install:
    • This is equivalent to ~4.19.0. It allows updates to any patch version within the 4.19 minor version (e.g., 4.19.2 → 4.19.5), but never to 4.20.0.
  • Example: If version 4.19.5 is the newest 4.19.x release and nothing is locked yet, npm will install it and record it in package-lock.json.


Impact on npm install and package-lock.json:

  • Exact version (“4.19.2”): No updates will occur unless you change the version manually in package.json.
  • Caret (^4.19.2): Newer patch or minor versions below 5.0.0 are allowed. When npm resolves the range, package-lock.json records the exact version chosen.
  • Tilde (~4.19.2) and “4.19”: Only patch updates within 4.19.x are allowed, and package-lock.json will reflect those updates when they occur.
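
To make the ranges concrete, here is a minimal Python sketch of the four notations. It is not npm's own semver implementation: it assumes plain numeric versions (no pre-release tags), full three-part versions after ^ and ~, and the function names are made up for the example.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn "4.19.2" into (4, 19, 2); pre-release tags are not handled."""
    return tuple(int(part) for part in version.split("."))


def allowed_range(spec: str) -> tuple[tuple[int, ...], tuple[int, ...]]:
    """Return (inclusive lower bound, exclusive upper bound) for a spec."""
    if spec.startswith("^"):
        low = parse(spec[1:])
        if low[0] > 0:
            return low, (low[0] + 1, 0, 0)  # up to the next major
        if low[1] > 0:
            return low, (0, low[1] + 1, 0)  # 0.x: up to the next minor
        return low, (0, 0, low[2] + 1)      # 0.0.x: that patch only
    if spec.startswith("~"):
        low = parse(spec[1:])
        return low, (low[0], low[1] + 1, 0)  # up to the next minor
    parts = parse(spec)
    if len(parts) == 2:                      # "4.19" means 4.19.x
        return (*parts, 0), (parts[0], parts[1] + 1, 0)
    return parts, (parts[0], parts[1], parts[2] + 1)  # exact version


def satisfies(version: str, spec: str) -> bool:
    low, high = allowed_range(spec)
    return low <= parse(version) < high


candidates = ["4.19.0", "4.19.2", "4.19.5", "4.20.1", "5.0.0"]
for spec in ["4.19.2", "^4.19.2", "~4.19.2", "4.19"]:
    matches = [v for v in candidates if satisfies(v, spec)]
    print(f"{spec:8} -> {matches}")
```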


How This Affects npm install:


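If you use npm install (as opposed to npm ci), and the notation in your package.json allows for updates (like ^4.19.2), npm may resolve a newer version within the range whenever the lock file does not already pin one (for example, after deleting package-lock.json, adding a dependency, or running npm update). As a result, the package-lock.json file will be updated with this new version.

On the other hand, if the package-lock.json specifies a version (say 4.19.2), and your package.json allows updates (^4.19.2), running npm install keeps 4.19.2, because the locked version still satisfies the range. The lock file changes only when package.json and package-lock.json disagree (for example, the range was edited to ^4.20.0), at which point npm install resolves a new version and rewrites the lock entry.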

This flexibility is a double-edged sword: it allows for automatic updates of patches and minor versions, but if not managed well, it can lead to differences in installed versions across environments, which is why many teams prefer using npm ci in CI/CD pipelines for consistency.

NPM install, ci and audit


I needed to take a step back and fully understand this issue so I could explain it clearly to both new and experienced developers. The problem surfaced because our deployment pipelines run npm audit, which became a bottleneck in our process. We kept seeing the same vulnerabilities flagged repeatedly, even though they had been fixed multiple times.

Here we go: npm install, ci and audit

Here’s an overview of the flow from installing a new package with npm to running npm install or npm ci in a pipeline, along with details on how vulnerabilities may resurface through npm audit.

Flow from Installing a Package to CI/CD

1. Installing a Package:

  • When you install a new package locally (e.g., npm install package-name), npm adds the package to the node_modules directory and updates your package.json and package-lock.json files (Yarn does the same with yarn.lock).
  • package.json specifies the declared dependencies and their versions.
  • package-lock.json contains the exact versions of the installed packages and their entire dependency tree (including transitive dependencies). This ensures that everyone who installs your project gets the same versions of dependencies.

2. Pushing to Version Control:

  • Once you are satisfied with your code, including the new dependency, you push the changes to version control (e.g., Git). It’s important that both the package.json and package-lock.json files are committed to ensure consistency across environments.


3. Pipeline – npm install vs npm ci:

  • npm install:
    • During a build or deployment pipeline, running npm install will install dependencies based on the package.json and update the node_modules directory.
    • If a package-lock.json file exists, npm installs the exact versions from the lock file, but if the lock file no longer matches package.json (e.g., a range was changed or a dependency was added), npm resolves again and may rewrite the lock file. (See dependency notation)
  • npm ci:
    • In a CI/CD pipeline, npm ci is preferred as it is usually faster and more deterministic.
    • It strictly adheres to the versions specified in package-lock.json. Any existing node_modules directory is deleted before the install, and if package.json and package-lock.json disagree, npm ci exits with an error instead of trying to reconcile them.
    • npm ci does not update package-lock.json, making it ideal for CI environments where reproducibility is critical.
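
Since npm ci refuses to run when package.json and package-lock.json disagree, it can help to catch drift before the pipeline does. The sketch below checks only one kind of drift: direct dependencies that were declared but never made it into the lock file. The function and type names are mine, and it assumes a lock file with lockfileVersion 2 or 3, where installed packages are keyed by path under packages. It is not how npm itself performs the check.

```typescript
// Minimal shapes for the parts of package.json and package-lock.json used here.
interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

interface PackageLock {
  // lockfileVersion 2 and 3 key installed packages by path, e.g. "node_modules/express".
  packages?: Record<string, { version?: string }>;
}

// Lists direct dependencies declared in package.json that have no entry in the lock file.
function findMissingFromLock(pkg: PackageJson, lock: PackageLock): string[] {
  const declared = { ...pkg.dependencies, ...pkg.devDependencies };
  const locked = lock.packages ?? {};
  return Object.keys(declared)
    .filter((name) => locked[`node_modules/${name}`] === undefined)
    .sort();
}
```

An empty result does not prove the two files agree (version ranges are not compared here), but a non-empty one usually means someone edited package.json without committing the updated lock file.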


4. npm audit:

  • By default, npm install also audits the resolved dependencies and prints a short summary. Running npm audit on its own gives the full report: it compares the installed packages against a database of known vulnerabilities and flags any risks.
  • npm audit fix can automatically update vulnerable dependencies to the latest non-breaking versions (as defined by semver). Fixes that need a breaking upgrade are left alone unless you opt in with npm audit fix --force.
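
Because npm audit was the bottleneck in our pipelines, it is worth deciding explicitly which severities should block a deploy. Here is a minimal sketch that counts findings at or above a chosen severity. It assumes the report has already been parsed from npm audit --json and that the severity counts sit under metadata.vulnerabilities, which is my understanding of the npm 7+ report format. The type and function names are my own.

```typescript
type Severity = "info" | "low" | "moderate" | "high" | "critical";

// Assumed shape: the severity counts reported under metadata.vulnerabilities.
interface AuditReport {
  metadata: { vulnerabilities: Partial<Record<Severity, number>> };
}

const SEVERITY_ORDER: Severity[] = ["info", "low", "moderate", "high", "critical"];

// Counts findings at or above `threshold`, so a pipeline can choose what blocks a deploy.
function countAtOrAbove(report: AuditReport, threshold: Severity): number {
  const start = SEVERITY_ORDER.indexOf(threshold);
  return SEVERITY_ORDER.slice(start).reduce(
    (sum, level) => sum + (report.metadata.vulnerabilities[level] ?? 0),
    0,
  );
}
```

For the simple case, npm's own --audit-level flag does the same job. A small script like this is mainly useful when you want to log the counts or apply different thresholds per environment.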


How Do npm audit Problems Reappear?

1. Indirect Dependencies (Transitive Dependencies):

  • Most npm packages rely on other packages (dependencies of dependencies), and vulnerabilities often arise in these indirect dependencies.
  • Even if you’ve addressed an issue by updating your direct dependencies, some transitive dependencies may still have unresolved issues. This happens when the transitive package has not yet released a fixed version, or when the package that depends on it still requires the vulnerable range.


2. New Vulnerabilities Discovered:

  • Sometimes, new vulnerabilities are discovered in packages that were previously considered safe. When npm’s vulnerability database is updated, a package you already patched can be flagged again for the newly discovered flaw, even though nothing in your code or lock file has changed.


3. Out-of-Date Dependencies:

  • When the package-lock.json or a specific package hasn’t been updated for a while, and a vulnerability was later fixed in a newer version, your audit might flag the outdated dependency.
  • Running npm audit regularly (especially on pipelines) will catch such vulnerabilities, but the issue can come back later if the lock file is regenerated or a new package pulls in an older version of the same transitive dependency.


4. Partial Fixes:

  • Sometimes, packages release partial fixes, where only certain issues are resolved. If the fix doesn’t cover all security concerns, npm audit may still flag the package.


5. Conflicts Between Versions:

  • Certain updates may not be backward compatible with your project’s current environment or with other dependencies. This can lead to situations where you are unable to fully update vulnerable dependencies without breaking something else in your codebase.


Dealing with Persistent npm audit Problems:

  • Explicit Version Control: Sometimes you may have to control versions manually, either by pinning exact versions in package.json or by using overrides (npm 8.3 and later) or resolutions (Yarn) to force a patched version of a transitive dependency.
  • Selective Fixing: If you know a particular vulnerability doesn’t affect your project (e.g., it only impacts a feature you don’t use), record it as an accepted exception with the reasoning, and narrow what the pipeline checks, for example with npm audit --audit-level=high or --omit=dev.
  • Monitor Transitive Dependencies: Regularly check your dependency tree to monitor transitive dependencies and see if any have lagging versions. This can be done using tools like npm ls or through dependency-checking platforms.
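
For the monitoring point, here is a minimal sketch that walks the output of npm ls --all --json and reports every place a given package shows up, with the version installed there. The type and function names are my own, and the node shape (a version plus nested dependencies) is an assumption based on what npm ls prints.

```typescript
// Assumed shape of a node in `npm ls --all --json` output.
interface DependencyNode {
  version?: string;
  dependencies?: Record<string, DependencyNode>;
}

// Walks the tree and returns every path at which `target` appears, with its version.
function findPackage(
  node: DependencyNode,
  target: string,
  path: string[] = [],
): string[] {
  const hits: string[] = [];
  const children = node.dependencies ?? {};
  for (const name of Object.keys(children)) {
    const child = children[name];
    const childPath = [...path, name];
    if (name === target) {
      hits.push(`${childPath.join(" > ")}@${child.version ?? "unknown"}`);
    }
    // Keep descending: the same package can appear again deeper in the tree.
    hits.push(...findPackage(child, target, childPath));
  }
  return hits;
}
```

Seeing the same package at two different versions is often the explanation for a vulnerability that keeps coming back: one copy was patched, the other is still pulled in by a dependency that has not caught up.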

Azure AppInsights with Nodejs adding operation id to response

I needed to add an additional response header to my requests to aid in tracking errors.

Here is my solution. A middleware that will add the AppInsights Operation ID to the response. The Operation ID is unique to each request so searching for it is simple.

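import { Request, Response, NextFunction } from 'express';
import appInsights from '../../lib/appInsights';

// Adds the AppInsights Operation ID to every response as a header.
export default (req: Request, res: Response, next: NextFunction) => {
    // The correlation context can be missing when no request is being tracked.
    const context = appInsights.getCorrelationContext();

    if (context?.operation?.id) {
        res.setHeader('X-Operation-ID', context.operation.id);
    }

    next();
};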
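
If you want to unit test the header without spinning up Express or AppInsights, one option is to inject the context lookup. The sketch below is a hypothetical variant of the middleware above: createOperationIdMiddleware and the two "Like" interfaces are names I made up so the example runs on its own, and they are not part of either library.

```typescript
// Minimal structural types so the sketch runs without Express or the AppInsights SDK.
interface CorrelationContextLike {
  operation: { id: string };
}

interface ResponseLike {
  setHeader(name: string, value: string): void;
}

// Builds the middleware around an injected context getter (hypothetical helper).
function createOperationIdMiddleware(
  getContext: () => CorrelationContextLike | null,
) {
  return (_req: unknown, res: ResponseLike, next: () => void): void => {
    const context = getContext();
    // Outside a tracked request there may be no context, so skip the header.
    if (context !== null) {
      res.setHeader("X-Operation-ID", context.operation.id);
    }
    next();
  };
}
```

In the real app you would pass a function that calls appInsights.getCorrelationContext(), and in a test you pass a stub that returns a fixed ID or null.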

Development questions…

I’ve been working on compiling a set of questions we should ask at the outset of each new feature. In my experience, teams often become laser-focused on coding and implementing new features without considering the broader context. I believe it’s crucial to be proactive and demonstrate the value of our work to the company, as ultimately, the bottom line is what matters most.

For every new feature, let’s ensure we ask these questions. The responses we gather can then inform the creation of tickets and specifications, ensuring that we’re aligned with both technical requirements and organizational objectives.

Aspect and Questions

Top Level User Story:
  • What is the main objective or goal of this feature/change?

Security Concerns:
  • Are there any potential security vulnerabilities or risks associated with this work?

Organizational Policies and Processes:
  • Are there specific policies or processes that need to be adhered to during development?

Value Addition:
  • How does this work enhance the application/product?
  • What additional benefits or improvements does it bring?

Justification:
  • Why is this work necessary or important?
  • What problem or need does it address?

Proving Worth:
  • How can we demonstrate the impact or value of this work?
  • What criteria or metrics can be used to measure its success?

Monitoring:
  • What metrics or indicators can be used to track the performance or usage of this feature/change?
  • How will we monitor its effectiveness over time?

Technical Requirements:
  • Are there any specific technical constraints or dependencies that need to be considered?
  • What technologies or frameworks should be utilized for this work?

User Experience (UX):
  • How will this work impact the user experience?
  • Are there any usability considerations to be aware of?

Testing and Quality Assurance:
  • What testing strategies will be employed to ensure the quality of the implementation?
  • Are there any specific test cases or scenarios that need to be addressed?

Scalability and Performance:
  • How will this work scale as the application grows?
  • Are there any performance considerations or benchmarks to meet?

Documentation:
  • What documentation needs to be created or updated as part of this work?
  • How will knowledge transfer be facilitated for other team members?

Deployment and Rollout:
  • What is the deployment plan for this work?
  • Are there any rollout or release strategies to consider?

Feedback and Iteration:
  • How will feedback be collected and incorporated into future iterations?
  • What mechanisms are in place for continuous improvement?

Collaboration and Communication:
  • How will communication be maintained between team members and stakeholders throughout the process?
  • Are there any collaboration tools or platforms to be used?

Risk Management:
  • What potential risks or challenges could arise during implementation, and how will they be mitigated?
  • Is there a contingency plan in place for unexpected issues?