How does a browser actually work?

A web browser is a software application that runs on your Internet-connected computer. It lets you view web pages and work with other content and technologies such as video, graphics files, and digital certificates.

In this article, I’d like to focus on how a browser actually works. Since most of your customers will interact with your web application through a browser, it’s imperative to understand the basics of these wonderful programs.

Let’s take a closer look at the steps these ingenious applications perform for us.

  1. Parsing
  2. DOM Tree
  3. Render Tree Construction
  4. Layout Computation
  5. Painting
  6. Compositing

Parsing :

HTML parsing involves tokenization and tree construction. We start with the raw HTML content, which goes through a process called tokenization. Tokenization is a common step when parsing almost any programming language: the source is split into tokens that are easier to work with than raw text. This is where the HTML parser works out where a tag starts and where it ends, which tag it is, and what is inside it.

From the tokens we know that the html tag starts at the top and that the head tag starts before the html tag ends, so we can figure out that head is inside html and create a tree out of it. The result is a parse tree, which eventually becomes the DOM tree. In short, the browser parses HTML into a DOM tree.
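The two phases can be sketched with Python's standard html.parser module, which does the tokenizing and reports start tags, end tags, and text through callbacks. The TreeBuilder class, the dictionary node shape, and the sample markup are assumptions made for illustration. A real browser parser follows the much more detailed HTML specification, including error recovery that this sketch leaves out.

```python
from html.parser import HTMLParser


class TreeBuilder(HTMLParser):
    """Builds a nested tree from start-tag, end-tag and text tokens."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#document", "children": []}
        self.stack = [self.root]  # elements that are currently open

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "children": []}
        self.stack[-1]["children"].append(node)  # child of the open element
        self.stack.append(node)                  # now the innermost open element
        # Simplification: void elements such as <meta> are not special-cased.

    def handle_endtag(self, tag):
        # Close the innermost open element with this name, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():  # ignore whitespace-only text
            self.stack[-1]["children"].append(
                {"tag": "#text", "text": data.strip()}
            )


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    label = '"' + node["text"] + '"' if node["tag"] == "#text" else node["tag"]
    lines = ["  " * depth + label]
    for child in node.get("children", []):
        lines.extend(outline(child, depth + 1))
    return lines


SAMPLE = "<html><head><title>Hi</title></head><body><p>Hello</p></body></html>"
print("\n".join(outline(parse(SAMPLE))))
```

The stack of open elements is what turns a flat token stream into nesting: head is attached under html because html is still open when the head start tag arrives.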

DOM Tree :

When a web page is loaded, the browser creates a Document Object Model (DOM) of the page, constructed as a tree of objects. The backbone of an HTML document is its tags. According to the DOM, every HTML tag is an object, nested tags are “children” of the enclosing one, and the text inside a tag is an object as well. DOM nodes have properties and methods that allow us to travel between them, modify them, move around the page, and more. An HTML or XML document is represented inside the browser as the DOM tree.

  • Tags become element nodes and form the structure.
  • Text becomes text nodes.
  • Everything else in HTML has its place in the DOM too, even comments.

We can use the browser's developer tools to inspect the DOM and modify it manually. Let's understand this through an example: I am taking one test.html file, and here is the DOM tree for this particular file.
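Outside the browser, the same node types can be explored with Python's standard xml.dom.minidom module, which implements a subset of the DOM API. The snippet below is only a sketch: the SOURCE markup is a hypothetical, well-formed document (not the author's test.html), and describe is a helper made up for this example.

```python
from xml.dom import Node, minidom

# Hypothetical document, kept well-formed so that an XML parser accepts it.
SOURCE = "<html><body><!-- greeting --><p>Hello <b>world</b></p></body></html>"

KIND = {
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
}


def describe(node, depth=0):
    """Return one indented line per DOM node: its kind and its name or text."""
    kind = KIND.get(node.nodeType, "other")
    if node.nodeType == Node.ELEMENT_NODE:
        value = node.tagName
    else:
        value = repr(node.data)  # text and comment nodes carry their content in .data
    lines = ["  " * depth + kind + " " + value]
    for child in node.childNodes:
        lines.extend(describe(child, depth + 1))
    return lines


document = minidom.parseString(SOURCE)
tree = describe(document.documentElement)
print("\n".join(tree))

# Nodes expose properties for travelling around the tree.
paragraph = document.getElementsByTagName("p")[0]
print(paragraph.parentNode.tagName)  # the enclosing element
print(paragraph.firstChild.data)     # the first text node inside <p>
```

Tags show up as element nodes, the text as text nodes, and the comment as a comment node, matching the list above.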

Render Tree Construction :

Rendering steps include style, layout, paint and, in some cases, compositing. The DOM tree and the CSSOM tree (the equivalent tree the browser builds from the page's CSS) created in the parsing step are combined into a render tree. The render tree is used to compute the layout of every visible element, which is then painted to the screen. In some cases, content can be promoted to its own layer and composited, which improves performance by painting portions of the screen on the GPU instead of the CPU and frees up the main thread.

To construct the render tree, the browser roughly does the following:

  1. Starting at the root of the DOM tree, traverse each visible node.
  2. Some nodes are not visible at all (e.g. script tags, meta tags, and so on) and are omitted, since they are not reflected in the rendered output.
  3. Some nodes are hidden via CSS and are also omitted from the render tree, e.g. a span node with an explicit rule that sets “display: none” on it is missing from the render tree.
  4. For each visible node, find the appropriate matching CSSOM rules and apply them.
  5. Emit visible nodes with content and their computed styles.
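The steps above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: DomNode is a toy stand-in for a DOM node, STYLES stands in for the CSSOM and matches on tag names only, and NON_VISUAL is a short hand-picked list. Real engines match full selectors and handle inheritance and the cascade.

```python
from dataclasses import dataclass, field


@dataclass
class DomNode:
    tag: str
    children: list = field(default_factory=list)


# Tags that never produce visible output (illustrative, not exhaustive).
NON_VISUAL = {"head", "script", "meta", "link", "style", "title"}

# Stand-in for the CSSOM: tag name -> declared properties.
STYLES = {
    "p": {"font-weight": "bold"},
    "span": {"display": "none"},
}


def build_render_tree(node):
    """Return a render node for `node`, or None if it is not rendered."""
    if node.tag in NON_VISUAL:
        return None                      # not reflected in the rendered output
    style = dict(STYLES.get(node.tag, {}))  # apply the matching rules
    if style.get("display") == "none":
        return None                      # hidden via CSS, subtree is dropped
    children = []
    for child in node.children:
        rendered = build_render_tree(child)
        if rendered is not None:
            children.append(rendered)
    return {"tag": node.tag, "style": style, "children": children}


def tags(render_node):
    """Flatten a render tree into its tag names, in document order."""
    result = [render_node["tag"]]
    for child in render_node["children"]:
        result.extend(tags(child))
    return result


dom = DomNode("html", [
    DomNode("head", [DomNode("meta")]),
    DomNode("body", [
        DomNode("p", [DomNode("span")]),
        DomNode("div"),
    ]),
])

render_tree = build_render_tree(dom)
print(tags(render_tree))
```

The head, meta, and span nodes exist in the DOM but never reach the render tree, while p arrives with its computed style attached.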

Layout Computation :

When a renderer is created and added to the tree, it does not have a position or size. Calculating these values is called layout or reflow.

Layout is a recursive process. It begins at the root renderer, which corresponds to the <html> element of the HTML document, and continues recursively through some or all of the renderer hierarchy, computing geometric information for each renderer that requires it.

The position of the root renderer is 0,0 and its dimensions are the viewport, the visible part of the browser window.

All renderers have a "layout" or "reflow" method; each renderer invokes the layout method of its children that need layout.

The layout usually has the following pattern:

  1. Parent renderer determines its own width.
  2. Parent goes over its children and:
     a. Places the child renderer (sets its x and y).
     b. Calls the child's layout if needed (the child is dirty, we are in a global layout, or for some other reason), which calculates the child's height.
  3. Parent uses the children's accumulated heights and the heights of margins and padding to set its own height; this will be used by the parent renderer's parent.
  4. Parent sets its dirty bit to false.
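A minimal sketch of this pattern in Python follows. The Box class and its fields are hypothetical, positions are stored relative to the parent, and only vertical stacking with padding is modelled. It assumes nothing about how any particular engine implements layout.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    content_height: float = 0.0  # height of the box's own content, e.g. a line of text
    padding: float = 0.0
    children: list = field(default_factory=list)
    dirty: bool = True           # True while the box still needs layout
    x: float = 0.0               # position relative to the parent box
    y: float = 0.0
    width: float = 0.0
    height: float = 0.0


def layout(box, available_width):
    # The renderer determines its own width.
    box.width = available_width
    inner_width = box.width - 2 * box.padding

    # It goes over its children, placing each one and laying it out if needed.
    cursor = box.padding
    for child in box.children:
        child.x, child.y = box.padding, cursor
        if child.dirty:
            layout(child, inner_width)  # this computes the child's height
        cursor += child.height

    # Its own height comes from the children's accumulated heights plus padding.
    box.height = cursor + box.content_height + box.padding

    # Finally it clears its dirty bit.
    box.dirty = False


first = Box(content_height=20)
second = Box(content_height=30)
root = Box(padding=10, children=[first, second])

layout(root, 800)
print(root.width, root.height)   # size of the root box
print(second.x, second.y)        # where the second child was placed
```

Width flows down from parent to child and height flows back up, which is why the parent can only set its own height after its children have been laid out.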

Firefox uses a "state" object (nsHTMLReflowState) as a parameter to layout (termed "reflow"). Among other things, the state includes the parent's width.
The output of the Firefox layout is a "metrics" object (nsHTMLReflowMetrics). It contains the renderer's computed height.

Let's consider a simple hands-on example:

<html>
  <head>
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <title>Critical Path: Hello world!</title>
  </head>
  <body>
    <div style="width: 50%">
      <div style="width: 50%">Hello world</div>
    </div>
  </body>
</html>

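The body of this page contains two nested divs: the outer div is 50% of the viewport width, and the inner div is 50% of its parent, that is, 25% of the viewport width. The output of the layout process is a “box model”, which captures the exact position and size of each element within the viewport.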
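As a quick worked example, the percentage widths in the page above can be resolved by hand. The 1000px viewport is an assumed value, resolve_width is a helper invented for this sketch, and the browser's default body margin is ignored to keep the arithmetic simple.

```python
def resolve_width(percent, containing_width):
    """Resolve a percentage width against the width of the containing block."""
    return containing_width * percent / 100


viewport_width = 1000  # assumed viewport width in CSS pixels

outer_div = resolve_width(50, viewport_width)  # 50% of the viewport
inner_div = resolve_width(50, outer_div)       # 50% of the outer div

print(outer_div, inner_div)
```

Each percentage is resolved against the parent's computed width, so the parent has to be laid out before the child's width is known.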

Painting :

Painting computes bitmaps and composites them to the screen. In the painting stage, the render tree is traversed and each renderer's "paint()" method is called to display content on the screen. Painting uses the UI infrastructure component.

Now that we know what things look like and where they should go, we draw some pixels on the screen. Paint creates the picture of the layout that needs to be rendered. Browser painting is special in its own way, however, as it can happen even without any changes to the DOM and/or CSSOM.

The diagram above was generated using the Performance panel in Chrome’s DevTools. It shows how much time each task took in the browser during the recorded period (0-7.12s) after reloading a page. As you can see, painting takes up a significant part, and that’s not automatically a bad thing. In this particular example, the increased painting is caused by a combination of animated GIFs on the page and canvas drawing (at 60fps). Neither changes the DOM or its styles, yet both still trigger painting.
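The traversal described in this section can be sketched as a function that walks a tree of already laid-out boxes and records draw commands in paint order, followed by a second function that turns those commands into a tiny character "bitmap". The dictionary box shape, the fill_rect command name, and the one-character "colours" are all assumptions for illustration; real engines paint text, borders, images, and much more.

```python
def paint(box, commands=None):
    """Record draw commands in paint order: a box first, then its children on top."""
    if commands is None:
        commands = []
    commands.append(("fill_rect", box["x"], box["y"], box["w"], box["h"], box["fill"]))
    for child in box.get("children", []):
        paint(child, commands)
    return commands


def to_bitmap(commands, width, height):
    """Execute the draw commands on a grid of characters, one character per pixel."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    for _name, x, y, w, h, fill in commands:
        for row in range(y, min(y + h, height)):
            for col in range(x, min(x + w, width)):
                grid[row][col] = fill  # later commands overwrite earlier ones
    return ["".join(row) for row in grid]


# A parent box with one child box on top of it (absolute coordinates).
page = {
    "x": 0, "y": 0, "w": 6, "h": 3, "fill": "a",
    "children": [{"x": 1, "y": 1, "w": 3, "h": 1, "fill": "b"}],
}

commands = paint(page)
bitmap = to_bitmap(commands, 6, 3)
print("\n".join(bitmap))
```

Because the child's command is recorded after its parent's, the child ends up drawn over the parent, which is the essence of paint order.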

Compositing :

How would you draw a page?

Now that the browser knows the structure of the document, the style of each element, the geometry of the page, and the paint order, how does it draw a page? Turning this information into pixels on the screen is called rasterising.

Perhaps a naive way to handle this would be to raster only the parts inside the viewport. If the user scrolls the page, the browser moves the rastered frame and fills in the missing parts by rastering more. This is how Chrome handled rasterising when it was first released. Modern browsers, however, run a more sophisticated process called compositing.

Compositing is a technique to separate parts of a page into layers, rasterise them separately, and composite them into a page in a separate thread called the compositor thread. If a scroll happens, the layers are already rasterised, so all the browser has to do is composite a new frame. In other words, the DOM elements are painted onto a number of layers. Once that is complete, the browser combines all the layers in the correct order and displays the result on the screen. This process is especially important for pages with overlapping elements, as an incorrect layer order may result in elements being displayed abnormally.
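The idea can be sketched in Python with layers that hold pre-rasterised rows of characters. The Layer class, its fields, and the use of a space as a transparent pixel are assumptions for this example. Real compositors work on GPU textures and run on their own thread, which this single-threaded sketch does not model.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    z: int               # stacking order, higher values are drawn on top
    top: int             # row where the layer starts, in page coordinates
    rows: list           # already rasterised rows; a space means transparent
    fixed: bool = False  # fixed layers do not move when the page scrolls


def composite(layers, scroll_y, viewport_height, width):
    """Combine rasterised layers into one frame without rasterising them again."""
    frame = [[" "] * width for _ in range(viewport_height)]
    for layer in sorted(layers, key=lambda item: item.z):  # back to front
        offset = layer.top - (0 if layer.fixed else scroll_y)
        for i, row in enumerate(layer.rows):
            y = offset + i
            if 0 <= y < viewport_height:
                for x, pixel in enumerate(row[:width]):
                    if pixel != " ":
                        frame[y][x] = pixel
    return ["".join(row) for row in frame]


content = Layer(z=0, top=0, rows=["1111", "2222", "3333", "4444"])
header = Layer(z=1, top=0, rows=["HH  "], fixed=True)
layers = [header, content]

at_top = composite(layers, scroll_y=0, viewport_height=2, width=4)
scrolled = composite(layers, scroll_y=2, viewport_height=2, width=4)
print(at_top)
print(scrolled)
```

Scrolling only changes the offset at which the existing content rows are copied into the frame, and sorting by z keeps the header on top regardless of the order in which the layers were created.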

Please see the Animation of compositing process

Conclusion:

For the most part, browsers are considered single-threaded. The developer's goal is to ensure performant site interactions, from smooth scrolling to being responsive to touch. Render time is key: the main thread must be able to complete all the work we throw at it and still always be available to handle user interactions. Web performance can be improved by understanding the single-threaded nature of the browser and minimising the main thread's responsibilities, where possible and appropriate, so that rendering is smooth and responses to interactions are immediate.

Here we covered the basics: the most used and important actions to start with. It is incredible how many things have to happen for this seemingly simple task to be accomplished. Truly an impressive journey for our little page :)
