RESTful API Design Guidelines

2018-04-02 web

/api/v1/<resources>/:id/<sub-resources>/:sub-id

resource object

  1. Keep the API object consistent with the DB object, including field names.
  2. Use lowercase field names, with words separated by underscores.
  3. If a resource object refers to several files, use a files array so that it scales.
  4. Use custom_data to keep custom parameters extensible, e.g.
     {
         "id": "a31469f3-5545-4be9-b640-f4466e51785a",
         "doc_type": "design",
         "version": 1,
         "schema_version": 1,
         "created_at": "2016-06-16T05:49:08.826Z",
         "created_by": "u873",
         "modified_at": "2016-06-17T05:49:08.826Z",
         "modified_by": "u873",
         "due_at": "2016-06-18T05:49:08.826Z",
         "files":[
             {"type": "audio", "id": "xxx", "url": "http://xxx"},
             {"type": "image", "id": "xxx", "url": "http://xxx"}
         ],
         "custom_data":{
             "key1": "value1"
         }
     }
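
As a rough illustration of rule 2, the following Python sketch lists the field names in a resource object that are not lowercase with underscores. The function name non_snake_case_fields and the regular expression are assumptions made for this example, not part of any library; keys inside custom_data are skipped on the assumption that they are defined by the caller.

```python
import re

# Assumed pattern: lowercase letters/digits, words separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")

def non_snake_case_fields(obj, skip=("custom_data",)):
    """Return the field names in a resource object that break the naming rule."""
    bad = []
    for key, value in obj.items():
        if not SNAKE_CASE.match(key):
            bad.append(key)
        if key in skip:
            continue  # custom_data keys are caller-defined (assumption)
        if isinstance(value, dict):
            bad.extend(non_snake_case_fields(value, skip))
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    bad.extend(non_snake_case_fields(item, skip))
    return bad

resource = {
    "id": "a31469f3-5545-4be9-b640-f4466e51785a",
    "doc_type": "design",
    "createdAt": "2016-06-16T05:49:08.826Z",  # deliberately breaks the rule
    "files": [{"type": "audio", "id": "xxx", "url": "http://xxx"}],
    "custom_data": {"key1": "value1"},
}
print(non_snake_case_fields(resource))
```

A check like this could run in a test suite so that naming drift is caught before an endpoint ships.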
    

Facebook, Google, GitHub, Netflix and a few other tech giants have let developers and products consume their data through APIs, and in doing so have become platforms for them. Even if you are not writing APIs for other developers and products, well-crafted APIs are always healthy for your own application.

There is a long-running and nuanced debate on the internet about the best way to design APIs. There are no official guidelines for it.

An API is an interface through which many developers interact with the data. A well-designed API is easy to use and makes the developer’s life smooth. The API is the GUI for developers: if it is confusing or not descriptive enough, developers will start looking for alternatives or stop using it. Developer experience is the most important metric for measuring the quality of an API.

The API is like an artist performing on stage, and its users are the audience.

Terminology

The following are the most important terms related to REST APIs:

  • A resource is an object or a representation of something that has data associated with it, and there can be a set of methods to operate on it. E.g. animals, schools and employees are resources, and delete, add and update are the operations performed on them.
  • A collection is a set of resources, e.g. Companies is the collection of Company resources.
  • A URL (Uniform Resource Locator) is a path through which a resource can be located and actions can be performed on it.

API endpoint

To understand this better, let’s write a few APIs for Companies, each of which has some Employees.
/getAllEmployees is an API that responds with the list of employees. A few more APIs around a Company would look as follows:

  • /addNewEmployee
  • /updateEmployee
  • /deleteEmployee
  • /deleteAllEmployees
  • /promoteEmployee
  • /promoteAllEmployees

There will be tons of other API endpoints like these for different operations, and many of them will duplicate the same actions. All these endpoints become burdensome to maintain as the API count increases.

What is wrong?
The URL should contain only resources (nouns), not actions or verbs. The API path /addNewEmployee contains the action addNew along with the resource name Employee.

Then what is the correct way?
The /companies endpoint is a good example: it contains no action. But then how do we tell the server which action to perform on the companies resource, i.e. whether to add, delete or update?

This is where the HTTP methods (GET, POST, DELETE, PUT), also called verbs, come in.

The resource should always be plural in the API endpoint and if we want to access one instance of the resource, we can always pass the id in the URL.

  • method GET path /companies should get the list of all companies
  • method GET path /companies/34 should get the detail of company 34
  • method DELETE path /companies/34 should delete company 34

In other use cases, where we have resources under a resource, e.g. Employees of a Company, a few sample API endpoints would be:

  • GET /companies/3/employees should get the list of all employees from company 3
  • GET /companies/3/employees/45 should get the details of employee 45, which belongs to company 3
  • DELETE /companies/3/employees/45 should delete employee 45, which belongs to company 3
  • POST /companies should create a new company and return the details of the newly created company

Aren’t the APIs now more precise and consistent? :)

Conclusion: The paths should contain the plural form of resources and the HTTP method should define the kind of action to be performed on the resource.
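
To make the noun-plus-verb idea concrete, here is a minimal Python sketch of a route table for the endpoints listed above. The ROUTES table, the action names and the resolve function are hypothetical and only illustrate the mapping; a real web framework would provide its own routing.

```python
import re

# Hypothetical route table: (HTTP method, path pattern, action name).
ROUTES = [
    ("GET",    r"^/companies$",                       "list_companies"),
    ("POST",   r"^/companies$",                       "create_company"),
    ("GET",    r"^/companies/(\d+)$",                 "get_company"),
    ("DELETE", r"^/companies/(\d+)$",                 "delete_company"),
    ("GET",    r"^/companies/(\d+)/employees$",       "list_employees"),
    ("GET",    r"^/companies/(\d+)/employees/(\d+)$", "get_employee"),
    ("DELETE", r"^/companies/(\d+)/employees/(\d+)$", "delete_employee"),
]

def resolve(method, path):
    """Return (action, ids) for a request, or None if no route matches."""
    for route_method, pattern, action in ROUTES:
        match = re.match(pattern, path)
        if route_method == method and match:
            return action, tuple(int(g) for g in match.groups())
    return None

print(resolve("GET", "/companies/3/employees/45"))
print(resolve("DELETE", "/companies/34"))
```

The same path appears with several methods, so the number of paths stays small while the HTTP method selects the action.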

HTTP methods (verbs)

HTTP defines a set of methods that indicate the type of action to be performed on a resource.

The URL is a sentence, where resources are nouns and HTTP methods are verbs.
The important HTTP methods are as follows:

  1. The GET method requests data from the resource and should not produce any side effects.
    E.g. GET /companies/3/employees returns the list of all employees of company 3.
  2. The POST method asks the server to create a resource in the database, most often when a web form is submitted.
    E.g. POST /companies/3/employees creates a new employee of company 3.
    POST is not idempotent: sending the same request several times can create several resources.
  3. The PUT method asks the server to update the resource, or to create it if it doesn’t exist.
    E.g. PUT /companies/3/employees/john asks the server to update the john resource in the employees collection under company 3, or to create it if it doesn’t exist. PUT is idempotent: sending the same request several times has the same effect as sending it once.
  4. The DELETE method asks the server to remove the resource, or an instance of it, from the database.
    E.g. DELETE /companies/3/employees/john asks the server to delete the john resource from the employees collection under company 3.

There are a few other methods, which we will discuss in another post.
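
The difference between a non-idempotent POST and an idempotent PUT can be shown with a small in-memory sketch in Python. The EmployeeStore class and its method names are hypothetical stand-ins for a server-side collection, not a real API.

```python
import itertools

class EmployeeStore:
    """In-memory stand-in for the employees collection of one company."""

    def __init__(self):
        self.employees = {}
        self._ids = itertools.count(1)

    def post(self, data):
        # POST: the server picks a new id, so every call adds a resource.
        new_id = str(next(self._ids))
        self.employees[new_id] = dict(data)
        return new_id

    def put(self, employee_id, data):
        # PUT: the client names the resource; create it or replace it.
        self.employees[employee_id] = dict(data)

    def delete(self, employee_id):
        # DELETE: remove the resource and report whether it existed.
        return self.employees.pop(employee_id, None) is not None

store = EmployeeStore()
store.post({"name": "john"})
store.post({"name": "john"})         # same request twice: two resources
store.put("john", {"name": "john"})
store.put("john", {"name": "john"})  # same request twice: still one "john"
print(len(store.employees))
```

Repeating the POST grows the collection, while repeating the PUT leaves it unchanged, which is what makes PUT safe to retry.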

HTTP response status codes

When a client sends a request to the server through an API, it should get feedback on whether the request succeeded, failed, or was malformed. HTTP status codes are a set of standardized codes, each with a defined meaning for a given scenario. The server should always return the right status code. The important categories of HTTP status codes are as follows:

2xx (Success category)

These status codes represent that the requested action was received and successfully processed by the server.

  • 200 OK is the standard HTTP response representing success for GET, PUT or POST.
  • 201 Created should be returned whenever a new instance is created. E.g. creating a new instance with the POST method should always return the 201 status code.
  • 204 No Content indicates that the request was processed successfully but no content is returned. DELETE is a good example. The API DELETE /companies/43/employees/2 deletes employee 2, and we do not need any data in the response body, as we explicitly asked the system to delete. If there is an error, e.g. employee 2 does not exist in the database, the response code should not be from the 2xx Success category but from the 4xx Client Error category.

3xx (Redirection Category)

  • 304 Not Modified indicates that the client has the response already in its cache. And hence there is no need to transfer the same data again.

4xx (Client Error Category)

These status codes indicate that the client has sent a faulty request.

  • 400 Bad Request indicates that the request was not processed because the server could not understand what the client is asking for.
  • 401 Unauthorized indicates that the client has not provided valid credentials for the resource, and should retry the request with the required credentials.
  • 403 Forbidden indicates that the request is valid and the client is authenticated, but the client is not allowed to access the page or resource for some reason. E.g. sometimes an authenticated client is not allowed to access a directory on the server.
  • 404 Not Found indicates that the requested resource is not available now.
  • 410 Gone indicates that the requested resource is no longer available because it has been intentionally removed.

5xx (Server Error Category)

  • 500 Internal Server Error indicates that the request is valid, but the server ran into an unexpected condition that prevented it from fulfilling the request.
  • 503 Service Unavailable indicates that the server is down or unable to receive and process the request, most often because it is undergoing maintenance.
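
As a small sketch of returning the right code, the Python example below uses the standard library enum http.HTTPStatus for the DELETE case described earlier. The employees dictionary and the delete_employee and category functions are hypothetical and exist only for this illustration.

```python
from http import HTTPStatus

employees = {("43", "2"): {"name": "john"}}  # hypothetical sample data

def delete_employee(company_id, employee_id):
    """Return the status code for DELETE /companies/:id/employees/:id."""
    if (company_id, employee_id) not in employees:
        return HTTPStatus.NOT_FOUND      # 404: nothing to delete
    del employees[(company_id, employee_id)]
    return HTTPStatus.NO_CONTENT         # 204: deleted, empty body

def category(status):
    """Map a status code to its category by its first digit."""
    names = {2: "Success", 3: "Redirection", 4: "Client Error", 5: "Server Error"}
    return names.get(int(status) // 100, "Other")

first = delete_employee("43", "2")
second = delete_employee("43", "2")
print(int(first), category(first))
print(int(second), category(second))
```

The first call deletes the employee and the second finds nothing, so the two calls land in different categories.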

Field name casing convention

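You can follow any casing convention, but make sure it is consistent across the application. If the request or response body is JSON, the usual advice is to follow camelCase to stay consistent. Note that the resource object guidelines at the top of this post use lowercase names with underscores instead; whichever you choose, do not mix the two.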

Searching, sorting, filtering and pagination

All of these actions are simply queries on one dataset, so no new set of APIs is needed to handle them. We just append query params to the GET method API.
Let’s look at a few examples of how to implement these actions.

  • Sorting: if the client wants a sorted list of companies, the GET /companies endpoint should accept sort params in the query.
    E.g. GET /companies?sort=rank_asc would sort the companies by rank in ascending order.
  • Filtering: to filter the dataset, we can pass various options through query params.
    E.g. GET /companies?category=banking&location=india would filter the companies list to those whose category is Banking and whose location is India.
  • Searching: when searching for a company name in the companies list, the API endpoint should be GET /companies?search=Digital%20Mckinsey (the space is URL-encoded).
  • Pagination: when the dataset is too large, we divide it into smaller chunks, which improves performance and makes the response easier to handle.
    E.g. GET /companies?page=23 means get the list of companies on the 23rd page.

If adding many query params to a GET request makes the URI too long, the server may respond with the 414 URI Too Long HTTP status. In those cases the params can be passed in the request body of a POST request instead.
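
The query params above can be built and parsed with the Python standard library module urllib.parse. In this sketch the function parse_list_query, the keys of its result, and the "field_direction" format of the sort value are assumptions made for illustration.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

def parse_list_query(url):
    """Split a GET /companies URL into sort, search, page, and filter params."""
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    sort = params.pop("sort", None)
    # Assumes the sort value looks like "<field>_<direction>", e.g. rank_asc.
    field, direction = sort.rsplit("_", 1) if sort else (None, None)
    return {
        "sort_field": field,
        "sort_direction": direction,
        "search": params.pop("search", None),
        "page": int(params.pop("page", 1)),
        "filters": params,  # whatever is left is treated as a filter
    }

# urlencode takes care of escaping the space in the search term.
query = urlencode({"category": "banking", "sort": "rank_asc",
                   "search": "Digital Mckinsey", "page": 23})
url = "/companies?" + query
print(url)
print(parse_list_query(url))
```

Letting the library encode and decode the values avoids hand-written escaping mistakes in search terms.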

Versioning

When your APIs are being consumed by the world, upgrading them with a breaking change would also break the existing products or services that use them.

http://api.yourservice.com/v1/companies/34/employees is a good example: it has the version number of the API in the path. If there is a major breaking update, we can name the new set of APIs v2 or v1.x.x.
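
On the server side, a version prefix in the path can be separated from the rest of the route before dispatching. The following Python sketch assumes a simple /v<major> prefix; the split_version function is hypothetical.

```python
import re

def split_version(path):
    """Return (major_version, rest_of_path) for paths such as /v1/companies/34."""
    match = re.match(r"^/v(\d+)(/.*)$", path)
    if not match:
        raise ValueError("path has no version prefix: " + path)
    return int(match.group(1)), match.group(2)

print(split_version("/v1/companies/34/employees"))
```

With the version extracted, v1 and v2 handlers can live side by side while old clients keep working.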




Shorten Long Links

2018-01-23 web

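The main driver was Twitter’s 140-character restriction (Weibo in China has the same 140-character limit). Benefits of short links: they are short, look cleaner, make it easier to control links and collect visitor statistics, and reduce spammy external links.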

Try a URL-shortening web service

Powered by Sina: http://dwz.wailian.work/ , usable both domestically and globally.

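Short-link algorithms basically all follow the same pattern: convert an auto-increment ID to base 62.

  1. Get the auto-increment ID of the record.
  2. Convert the ID to base 62 and append it to the domain, e.g. http://qetee.com/w7e
  3. When a user visits http://qetee.com/w7e, extract the short-URL suffix w7e.
  4. Convert the suffix back to base 10 to get the auto-increment ID, e.g. 123456
  5. Use the ID to look up the record and run the business logic (for example, a redirect).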

Code implementation

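using System.Linq; // needed for s.Reverse() in Encode

namespace Alphabet
{
    public class AlphabetTest
    {
        // Note: this alphabet has 36 symbols, so Base is 36.
        // Append "ABCDEFGHIJKLMNOPQRSTUVWXYZ" to get the base-62 scheme described above.
        public static readonly string Alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
        public static readonly int Base = Alphabet.Length;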

        public static string Encode(int i)
        {
            if (i == 0) return Alphabet[0].ToString();

            var s = string.Empty;

            while (i > 0)
            {  
                s += Alphabet[i % Base];
                i = i / Base;
            }

            return string.Join(string.Empty, s.Reverse());
        }

        public static int Decode(string s)
        {
            var i = 0;

            foreach (var c in s)
            {
                i = (i * Base) + Alphabet.IndexOf(c);
            }

            return i;
        }

        public static void Main(string[] args) 
        {
            // Simple test of encode/decode operations
            for (var i = 0; i < 10000; i++) 
            {
                if (Decode(Encode(i)) != i) 
                {
                    System.Console.WriteLine("{0} is not {1}", Encode(i), i);
                    break;
                }
            }
        } 
    }
}
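
The C# listing uses a 36-symbol alphabet. As a sketch of the base-62 variant that the numbered steps describe, here is a Python version. The alphabet order (digits, then lowercase, then uppercase) is an assumption; with that order the ID 123456 maps to the suffix w7e used in the example URL.

```python
import string

# Assumed alphabet order: digits, lowercase, uppercase (62 symbols in total).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)

def encode(number):
    """Convert a non-negative auto-increment ID to a base-62 suffix."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))

def decode(suffix):
    """Convert a base-62 suffix back to the integer ID."""
    number = 0
    for char in suffix:
        number = number * BASE + ALPHABET.index(char)
    return number

short = encode(123456)
print(short, decode(short))  # prints: w7e 123456
```

Three base-62 characters cover 62^3 = 238328 IDs, which is why the suffixes stay short even for large tables.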

SaaS comparison

Service name   Pros           Cons   Example
Bitly.com      top rated      -      http://bit.ly/2eJynMi
Goo.gl         Google’s       -      https://goo.gl/PkEbhG
TinyURL.com    customizable   long   https://tinyurl.com/ezhome-hs

Conclusion

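For international use, Bitly is a good choice: it has many users and offers visitor analytics.
For China I have not found a mature service yet; cmcc.in looks like a small operation and does not seem very reliable.
There are two options, and I lean toward the second, because the data stays in our own hands:

  1. Partner with a large domestic company such as Sina Weibo, Taobao or Baidu. Large companies do not easily go under or suspend services, so the service is reliable.
  2. Build it ourselves. The technical barrier is not very high.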



Emoji support when using MySQL

2017-10-20 database

While checking our feedback email, I found that many overseas users want to save emoji/emoticons in their profiles and designs, yet our apps currently don’t support this. So I researched how to support emoji.

Database: something special when using MySQL

I was using the utf8 encoding in MySQL, which I thought (wrongly) could represent all Unicode characters.
The code points for emoticons are fully supported by the UTF-8 encoding; however, MySQL’s utf8 does not support them! To save emoticons to a MySQL database we need to use utf8mb4.
This is what the MySQL documentation has to say about it:

The difference between MySQL’s utf8 and utf8mb4 is that the former can only store characters of up to 3 bytes, whereas the latter can store 4-byte ones as well. Therefore with utf8 we can only store Unicode characters from the Basic Multilingual Plane. Put more simply, utf8 is suitable for characters from the majority of modern languages and some symbols. Emoticon characters live in the Supplementary Multilingual Plane, for which we need utf8mb4.

Luckily, MySQL 5.5.3 (released in early 2010) introduced a new encoding called utf8mb4 which maps to proper UTF-8 and thus fully supports Unicode, including astral symbols.
The utf8mb4 Character Set (4-Byte UTF-8 Unicode Encoding)
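
The 3-byte versus 4-byte distinction is easy to check in Python, whose str.encode produces standard UTF-8. The sample characters and the helper names utf8_length and fits_mysql_utf8 are chosen only for this illustration.

```python
samples = {
    "A": "A",                # ASCII
    "e-acute": "\u00e9",     # Latin-1 Supplement
    "CJK": "\u4e2d",         # Basic Multilingual Plane
    "emoji": "\U0001F600",   # Supplementary Multilingual Plane
}

def utf8_length(char):
    """Number of bytes needed to encode one character in UTF-8."""
    return len(char.encode("utf-8"))

def fits_mysql_utf8(text):
    """True if every character needs at most 3 bytes, i.e. lies in the BMP."""
    return all(utf8_length(ch) <= 3 for ch in text)

for name, char in samples.items():
    print(name, hex(ord(char)), utf8_length(char))
```

A helper like fits_mysql_utf8 could be used to find which user inputs would be rejected or truncated by a utf8 column before the migration is done.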

Step 1: Create a backup

Create a backup of all the databases on the server you want to upgrade. Safety first!

Step 2: Upgrade the MySQL server

Upgrade the MySQL server to v5.5.3+, or ask your server administrator to do it for you.

Step 3: Modify databases, tables, and columns

Change the character set and collation properties of the databases, tables, and columns to use utf8mb4 instead of utf8.

# For each database:
ALTER DATABASE database_name CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
# For each table:
ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
# For each column:
ALTER TABLE table_name CHANGE column_name column_name VARCHAR(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Since utf8mb4 is fully backwards compatible with utf8, no mojibake or other forms of data loss should occur. (But you have a backup, right?)

Step 4: Check the maximum length of columns and index keys
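
Because utf8mb4 reserves up to 4 bytes per character instead of 3, a column or index key whose limit is counted in bytes holds fewer characters after the conversion. The Python arithmetic below assumes the 767-byte InnoDB index key prefix limit that applies to the older default row formats; under that assumption an indexed column drops from 255 to 191 characters, which matches the VARCHAR(191) used in the column example in Step 3.

```python
# Assumed limit: InnoDB index key prefix of 767 bytes (older default row formats).
INDEX_LIMIT_BYTES = 767

def max_indexed_chars(bytes_per_char, limit=INDEX_LIMIT_BYTES):
    """Largest character count whose worst-case byte size fits in the limit."""
    return limit // bytes_per_char

print("utf8:", max_indexed_chars(3))
print("utf8mb4:", max_indexed_chars(4))
```

Check the limit that applies to your own server version and row format before resizing columns.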

Step 5: Modify connection, client, and server character sets

In the application’s connection string, set CharSet=utf8mb4, for example:

  <connectionStrings>
    <add name="HSMDbEntities"
    connectionString="metadata=res://*/Data.HSMDbModel.csdl|res://*/Data.HSMDbModel.ssdl|res://*/Data.HSMDbModel.msl;provider=Devart.Data.MySql;provider connection string=&quot;server=rm-domain.mysql.rds.aliyuncs.com;user id=*#uid#*;password=*#pwd#*;persist security info=True;database=hsmdb;CharSet=utf8mb4;Pooling=true;Min Pool Size=2;Max Pool Size=30;&quot;" providerName="System.Data.EntityClient"
  xdt:Transform="SetAttributes" xdt:Locator="Match(name)"/>
  </connectionStrings>
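
Then check the server version and the character set variables:

SHOW VARIABLES LIKE 'ver%';
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

Result (the client, connection, and results character sets are utf8mb4, while the database and server defaults are still utf8):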
character_set_client	utf8mb4
character_set_connection	utf8mb4
character_set_database	utf8
character_set_filesystem	binary
character_set_results	utf8mb4
character_set_server	utf8
character_set_system	utf8
collation_connection	utf8mb4_general_ci
collation_database	utf8_general_ci
collation_server	utf8_general_ci

Step 6: Repair and optimize all tables

# For each table:
REPAIR TABLE table_name;
OPTIMIZE TABLE table_name;
# Or, for all databases at once, from the shell:
$ mysqlcheck -u root -p --auto-repair --optimize --all-databases




ms17

Software Engineer and Full Stack Developer, from Shanghai, China.