hiQ v LinkedIn: the blurred legal frontier of scraping public data
In the digital economy, companies are locked in a fierce battle over data, and disputes over who owns data and who has the right to scrape it are arising around the world. In the recent case of hiQ v LinkedIn (2019), the US Court of Appeals for the Ninth Circuit affirmed the preliminary injunction granted by the district court and remanded the case for further proceedings. A notable public data scraping case in the United States, it raises the question of whether social or content platforms should be able to prohibit other operators from scraping (via web crawlers or other means) and using their publicly available user data.
Data analytics company hiQ used web crawlers to extract public personal data of LinkedIn users, such as names, positions, work experience and expertise, analyzed the data algorithmically and sold the results to employers. LinkedIn sent hiQ a cease-and-desist letter demanding that it stop accessing and copying data from LinkedIn’s servers, and claiming that hiQ’s scraping violated federal and state laws, including the Computer Fraud and Abuse Act (CFAA). LinkedIn also took a series of technical steps to block hiQ from accessing its website.
For its part, hiQ filed a complaint against LinkedIn and sought a preliminary injunction prohibiting LinkedIn from using legal or technical means to prevent hiQ from accessing public data. The district court granted the injunction on the grounds that the publicly available data on LinkedIn did not fall within the scope of the CFAA and that hiQ’s scraping did not constitute “unauthorized access”. Additionally, giving companies with huge user databases free rein to decide who can access and use that data could lead to unfair competition.
The preliminary injunction and the subsequent trial on the merits give the courts an opportunity to further determine the ownership of public data and the legal limits on data scraping.
Data ownership is the starting point, and the central issue, of all current disputes involving the scraping of public data. Such data is generated by users, collected and controlled by platforms, and made accessible to the public. So who ultimately owns it – the user, the platform or the public? There is no consensus. It has also been suggested that the costs of defining property rights in data are simply too high and that alternatives should be considered.
Judicial practice in China
In China’s current judicial practice, similar cases are often analyzed under the Anti-Unfair Competition Law, which to some extent sidesteps the issue of who owns public data. Courts generally accept a company’s assertion that data obtained in the course of its operations is a valuable business asset and competitive resource, and should therefore be protected by law.
On this basis, the analysis under the general provisions of the Anti-Unfair Competition Law focuses on assessing the legitimacy of the scraping and the use of the data, considering factors such as: the ease with which the data was obtained in the first place; whether the data has been processed; whether the company applied crawler protocols or technical measures to restrict data scraping by third parties; and whether the scraper reaped unearned profits by selling homogeneous products. The courts took this approach in both Dianping v Baidu (2015) and the more recent Douyin v Shuabao (2019), which dealt with content scraping.
However, the “unfair competition” approach tends to emphasize the rights of business operators while neglecting the role of user authorization in the access to and circulation of data.
Users are the original owners of the data, and the fact that they share their information online does not mean they also agree to its being collected and used by third parties for any purpose. The wishes of users regarding where their data goes and how it is used must prevail over the commercial interests of the platforms.
In Weibo v Maimai (2015), the court established the principle of “user authorization, platform authorization and user authorization again” for third-party access to personal information through an open application programming interface (API). However, when it comes to non-personal information, clarifying and balancing user wishes and platform interests can be a difficult task. For example, if a user allows a third-party platform to access their data, is the original platform obligated to facilitate the transfer of that data? Or, if the platform insists on denying access to or use of the data, is the scraper absolved of liability by the user’s consent? The case of Weibo v Toutiao (2017) addressed this issue.
The hiQ case also raises anti-monopoly questions. If a platform controls data that is critical to its competitors’ business model, and that business model and the differentiated products that result from it could benefit the public, does denying others access to this open data potentially restrict competition or even create a data monopoly?
The complexity of data competition cases arises from multi-level debates about the types of data, the ways it is displayed and the ways it is used. Resolving them is also a delicate balancing act, with business interests on one side, and user choice, open exchange and data sharing, and data security on the other. The data accumulated by platforms and their hard-earned resource advantages deserve protection, but we cannot afford to be overprotective and neglect the interests of consumers, other operators and the public, lest this create unwanted data barriers. As it stands, lawmakers and courts have yet to set a definitive framework or guidelines against which businesses can fine-tune their data compliance.
Wang Yaxi and Wu Yue are partners at Yuanhe Partners
58F, Fortune Financial Center (FFC)
5 Dongsanhuan Zhonglu, Chaoyang District
Beijing 100020, China
Tel: +86 10 5733 2388
Fax: +86 10 5733 2399