CHAPTER 8: DESIGN A URL SHORTENER

Step 1 - Understand the problem and establish design scope

Candidate: Can you give an example of how a URL shortener work?
Interviewer: Assume URL
https://www.systeminterview.com/q=chatsystem&c=loggedin&v=v3&l=long is the original
URL. Your service creates an alias with shorter length: https://tinyurl.com/ y7keocwj. If you
click the alias, it redirects you to the original URL.
Candidate: What is the traffic volume?
Interviewer: 100 million URLs are generated per day.
Candidate: How long is the shortened URL?
Interviewer: As short as possible.
Candidate: What characters are allowed in the shortened URL?
Interviewer: Shortened URL can be a combination of numbers (0-9) and characters (a-z, AZ).
Candidate: Can shortened URLs be deleted or updated?
Interviewer: For simplicity, let us assume shortened URLs cannot be deleted or updated.
Here are the basic use cases:
1.URL shortening: given a long URL => return a much shorter URL
2.URL redirecting: given a shorter URL => redirect to the original URL
3.High availability, scalability, and fault tolerance considerations

Back of the envelope estimation

• Write operation: 100 million URLs are generated per day.
• Write operation per second: 100 million / 24 /3600 = 1160
• Read operation: Assuming ratio of read operation to write operation is 10:1, read
operation per second: 1160 * 10 = 11,600
• Assuming the URL shortener service will run for 10 years, this means we must support
100 million * 365 * 10 = 365 billion records.
• Assume average URL length is 100.
• Storage requirement over 10 years: 365 billion * 100 bytes * 10 years = 365 TB
It is important for you to walk through the assumptions and calculations with your
interviewer so that both of you are on the same page.

Step 2 - Propose high-level design and get buy-in

API Endpoints

1.URL shortening. To create a new short URL, a client sends a POST request, which contains
one parameter: the original long URL. The API looks like this:
POST api/v1/data/shorten
• request parameter: {longUrl: longURLString}
• return shortURL
2.URL redirecting. To redirect a short URL to the corresponding long URL, a client sends a
GET request. The API looks like this:
GET api/v1/shortUrl
• Return longURL for HTTP redirection

URL redirecting

CHAPTER 8: DESIGN A URL SHORTENER_第1张图片
Figure 8-1 shows what happens when you enter a tinyurl onto the browser. Once the server receives a tinyurl request, it changes the short URL to the long URL with 301 redirect.
CHAPTER 8: DESIGN A URL SHORTENER_第2张图片

301 redirect. A 301 redirect shows that the requested URL is “permanently” moved to the long URL. Since it is permanently redirected, the browser caches the response, and subsequent requests for the same URL will not be sent to the URL shortening service. Instead, requests are redirected to the long URL server directly.
302 redirect. A 302 redirect means that the URL is “temporarily” moved to the long URL, meaning that subsequent requests for the same URL will be sent to the URL shortening service first. Then, they are redirected to the long URL server.

The most intuitive way to implement URL redirecting is to use hash tables.

URL shortening

CHAPTER 8: DESIGN A URL SHORTENER_第3张图片
• Each longURL must be hashed to one hashValue.
• Each hashValue can be mapped back to the longURL.

Step 3 - Design deep dive

Data model

In the high-level design, everything is stored in a hash table.
A better option is to store mapping in a relational
database.
CHAPTER 8: DESIGN A URL SHORTENER_第4张图片

Hash function

The hashValue consists of characters from [0-9, a-z, A-Z], containing 10 + 26 + 26 = 62 possible characters. To figure out the length of hashValue, find the smallest n such that 62^n ≥ 365 billion.
CHAPTER 8: DESIGN A URL SHORTENER_第5张图片
CHAPTER 8: DESIGN A URL SHORTENER_第6张图片
even the shortest hash value (from CRC32) is too long (more than 7
characters). How can we make it shorter?
CHAPTER 8: DESIGN A URL SHORTENER_第7张图片
still too expensive, use bloom filter(method to test if something is in a set)

Base 62 conversion

CHAPTER 8: DESIGN A URL SHORTENER_第8张图片

CHAPTER 8: DESIGN A URL SHORTENER_第9张图片

URL shortening deep dive

CHAPTER 8: DESIGN A URL SHORTENER_第10张图片

The distributed unique ID generator is worth mentioning.

URL redirecting deep dive

As there are more reads than writes, mapping is stored in a cache to improve performance.
CHAPTER 8: DESIGN A URL SHORTENER_第11张图片

Step 4 - Wrap up

If there is extra time at the end of the interview, here are a few additional talking points.
• Rate limiter: A potential security problem we could face is that malicious users send an overwhelmingly large number of URL shortening requests. Rate limiter helps to filter out requests based on IP address or other filtering rules. If you want to refresh your memory
about rate limiting, refer to “Chapter 4: Design a rate limiter”.
• Web server scaling: Since the web tier is stateless, it is easy to scale the web tier by adding or removing web servers.
• Database scaling: Database replication and sharding are common techniques.
• Analytics: Data is increasingly important for business success. Integrating an analytics solution to the URL shortener could help to answer important questions like how many people click on a link? When do they click the link? etc.
• Availability, consistency, and reliability. These concepts are at the core of any large
system’s success. We discussed them in detail in Chapter 1, please refresh your memory on these topics.

你可能感兴趣的:(System,Design,系统架构)