If you use websites with European domains, you probably know what the ‘cookie wall’ looks like. The cookie wall is a message informing you that the website uses cookies. But do you know what cookies are and how exactly they work?
Before we start talking about cookies, we need to explain something. HTTP communication is stateless. What does that mean?
The HTTP protocol is used for communication between the client (in this case the browser) and the server. The communication is based on request-response messages. Stateless means that the protocol itself does not keep any data that is shared between messages, so the server handling a request does not know what happened in previous requests. This is a problem if, for example, we have authentication logic. We send the first request with the user data needed to log in; the server checks this data and responds that everything is fine: the user exists and can use the app. Then we send another request, this time for some specific data. No state from the authentication request is kept, so the server doesn’t know whether the user sending this request has permission to read these resources.
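To make the problem more concrete, here is a minimal sketch of it in Python. Everything in it is made up for illustration: the `USERS` and `SESSIONS` dictionaries, the `login` and `get_data` handlers, and the token format are assumptions, not how any real server works. It only shows that a request is recognised when it carries something that identifies the user, and rejected when it does not.

```python
# Hypothetical in-memory "server", used only to illustrate statelessness.
USERS = {"alice": "s3cret"}   # made-up credentials
SESSIONS = {}                 # token -> username, kept on the server side


def login(request):
    """Check the credentials and hand back a token for later requests."""
    user, password = request["user"], request["password"]
    if USERS.get(user) != password:
        return {"status": 401}
    token = f"token-for-{user}"   # a real server would use a random value
    SESSIONS[token] = user
    return {"status": 200, "token": token}


def get_data(request):
    """Each request is handled in isolation: only what it carries counts."""
    user = SESSIONS.get(request.get("token"))
    if user is None:
        return {"status": 401}
    return {"status": 200, "body": f"private data for {user}"}


first = login({"user": "alice", "password": "s3cret"})
without_token = get_data({})                        # nothing identifies the user
with_token = get_data({"token": first["token"]})    # the client repeats the token

print(without_token["status"], with_token["status"])
```

The second request only succeeds when the client repeats the identifying value itself, because the protocol will not do it for us.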
Cookies in practice
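Cookies solve this problem: the server sends a small piece of data in its response, and the browser stores it and sends it back with later requests to the same site. Now that you know how it works in a nutshell, let’s follow this process in practice. Let’s use my favourite example: www.google.com.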
If we open the developer tools on the Google site and refresh the page, we can see all the requests sent by the browser. Let’s focus on the first one, which just asks for the main page.
We can see here that the server’s response contains two headers with the `set-cookie` key. Their values will be added to every subsequent request sent to this domain, but not in full: part of each string holds the cookie settings, which define how the cookie should be used. Let’s take a look at the second header.
CONSENT=PENDING+155; expires=Fri, 01-Jan-2038 00:00:00 GMT; path=/; domain=.google.com
Let’s break this down into parts.
CONSENT=PENDING+155 - this is the main part of our cookie: the key-value pair which will be included in further requests.
expires=Fri, 01-Jan-2038 00:00:00 GMT - this declares an expiration date. This means that the above value will be added to every request to this domain until the year 2038. After this time the cookie will be deleted from the browser.
path=/ - this means that the cookie will be sent regardless of the path in the URL. (For example, it will also be sent if the request goes to www.google.com/anything.)
domain=.google.com - the cookie will be sent to this domain and all of its subdomains. (The leading dot is a legacy convention; modern browsers ignore it and include subdomains whenever the domain is set.)
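The same breakdown can be done in code. The sketch below uses `SimpleCookie` from Python's standard `http.cookies` module to split the header value from the example into the key-value pair and its settings. The variable names are just for illustration, and a browser's own parser is more lenient than this module.

```python
from http.cookies import SimpleCookie

# The header value from the example above.
header = (
    "CONSENT=PENDING+155; expires=Fri, 01-Jan-2038 00:00:00 GMT; "
    "path=/; domain=.google.com"
)

cookie = SimpleCookie()
cookie.load(header)

morsel = cookie["CONSENT"]   # one Morsel object per cookie name

# The part that is sent back to the server in later requests.
name_value = f"{morsel.key}={morsel.value}"

# The settings, which stay in the browser and are never sent back.
settings = {attr: morsel[attr] for attr in ("expires", "path", "domain")}

print(name_value)
print(settings)
```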
Now let’s look at the next request. To be more specific, we are interested in the headers.
As we can see, the next request has the header ‘cookie’ with a value containing two cookies that we received in the first response:
Key NID with the value of some long string of random characters
Key CONSENT with the value of PENDING+155
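As a rough sketch of what the browser does here, the stored key-value pairs are joined with `; ` into a single `cookie` header value. The `stored` dictionary and the `cookie_header` function are hypothetical names, and the NID value is a made-up placeholder for the long random string.

```python
# Cookies the browser stored for this domain. The NID value is a made-up
# placeholder; the real one is a long string of random characters.
stored = {"NID": "abc123-placeholder", "CONSENT": "PENDING+155"}


def cookie_header(cookies):
    """Join name=value pairs the way they appear in the Cookie request header."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


header_value = cookie_header(stored)
print(f"cookie: {header_value}")
```

Note that only the key-value pairs are sent. The settings such as `expires` or `path` are not part of the request.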
We can also check all cookies for this domain in the developer tools, where the values match those shown above.
Each key-value pair can also have more settings than the ones we described earlier.
Additional settings for cookies
Expires - a specific date after which the browser should delete the cookie. If the cookie in the server response has neither this setting nor Max-Age, it is a session cookie. This means that the browser will delete it when the browsing session ends, which usually means when the browser is closed, not just the tab with the page.
Max-Age - the maximum time to keep the cookie, specified in seconds. It has a higher priority than Expires. If we set it to a value equal to or less than 0, the cookie expires immediately. This is a good way to remove cookies that have already been stored.
Secure - the cookie will only be added to secure requests (HTTPS).
Path - if the path is set, the cookie will only be added to requests whose URL path starts with this path.
Domain - with this key we can specify which domain is allowed to receive the cookie. If it is set, the cookie is also sent to all subdomains of that domain; if it is omitted, the cookie is only sent to the exact host that set it, so this is how we can exclude subdomains.
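To see how these settings end up in a response, here is a small sketch of the server side, again using `SimpleCookie` from Python's standard library. The cookie name `session_id`, its value, the path `/account` and the domain `example.com` are all made up for the example.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session_id"] = "abc123"                  # hypothetical name and value
cookie["session_id"]["max-age"] = 3600           # keep the cookie for one hour
cookie["session_id"]["path"] = "/account"        # hypothetical path
cookie["session_id"]["domain"] = "example.com"   # hypothetical domain
cookie["session_id"]["secure"] = True            # HTTPS requests only

set_header = cookie["session_id"].OutputString()

# Removing a stored cookie: same name, path and domain, but Max-Age=0.
removal = SimpleCookie()
removal["session_id"] = "deleted"                # the value no longer matters
removal["session_id"]["max-age"] = 0
removal["session_id"]["path"] = "/account"
removal["session_id"]["domain"] = "example.com"

remove_header = removal["session_id"].OutputString()

print("Set-Cookie:", set_header)
print("Set-Cookie:", remove_header)
```

The removal header repeats the path and domain on purpose: the browser identifies a cookie by its name together with these two settings, so a header with a different path would create a second cookie instead of deleting the first one.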
SameSite is a special setting that defines the policy on how cookies are sent with cross-site requests. Let’s say we are on the xyz.com site and it generates a request to the abc.com site, which we visited before, so we have some of its cookies stored. For this case we can use the SameSite option, which can be set to one of 3 values:
- none: adds the cookie to every request for this domain, regardless of the source. This used to be the default, but some modern browsers (Chrome, for example) now treat cookies without a SameSite setting as lax and only accept none together with Secure.
- strict: do not add the cookie if the request to our domain comes from a different site.
- lax: here the rule is a bit more complicated. In short: add the cookie when the user navigates to our domain from another site (for example by clicking a link), but not when the other site loads our resources in the background, such as images or frames.
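The three values can be summarised as a small decision function. This is a simplified sketch and not the exact algorithm browsers implement: the function `should_send` and its parameters are hypothetical, and it ignores other settings such as Secure, Path and Domain.

```python
def should_send(same_site, same_site_request, top_level_navigation, safe_method=True):
    """Decide whether a cookie is attached, based only on its SameSite value.

    same_site             -- "none", "lax" or "strict"
    same_site_request     -- True if the request starts from our own site
    top_level_navigation  -- True if the user is navigating to our site
                             (for example by clicking a link)
    safe_method           -- True for methods such as GET
    """
    if same_site_request:
        return True    # requests from our own site always qualify
    if same_site == "none":
        return True    # sent regardless of the source
    if same_site == "lax":
        return top_level_navigation and safe_method
    return False       # "strict": never sent on cross-site requests


values = ("none", "lax", "strict")

# A user on xyz.com clicks a link to abc.com (cross-site, top-level, GET).
link_click = {v: should_send(v, False, True) for v in values}

# xyz.com embeds an image hosted on abc.com (cross-site, not a navigation).
embedded = {v: should_send(v, False, False) for v in values}

print(link_click)
print(embedded)
```

In this sketch the only difference between lax and strict shows up in the link-click case, which is why lax is often described as the compromise between the other two.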
Nowadays, cookies are used not only to help the user but also, and often more importantly, to collect a lot of information about us for user profiling. Now that more and more industries are moving to the internet, a good profile of a potential buyer is very valuable for a business.
The last thing I want to write about is the supercookie. Standard cookies have many restrictions, such as a maximum size of about 4 KB and the fact that they are easy to remove. In response to these limitations, a new kind of cookie was born: the supercookie.
The supercookie is a cookie that is stored outside of the browser’s normal cookie storage, so it is not so easy to delete. It is used to collect information about a user, but also to create a zombie cookie.
The zombie cookie is a cookie that is automatically recreated when someone deletes it.
I hope you now have the basic knowledge of how cookies work.