Both reports left some observers unsettled, though whether the news should have outraged anyone is debatable.
On one hand, when people post information about themselves to message boards, and use their real names, that data is inherently public. Maybe in the past, when Facebook's system was more closed, people could have reasonably expected that at least information on Facebook wouldn't go beyond their friends. But even on Facebook these days, much of what people write is available throughout the Web.
On the other hand, many users clearly don't expect that their posts will be investigated by outsiders -- whether the government, marketing companies or human resources directors -- for insights into their personalities or potential buying behavior.
The legal issues of scraping data are extremely murky -- though not necessarily for privacy reasons. When companies allege that scraping is unlawful, they tend to argue that scraping infringes sites' copyrights, or constitutes a trespass, or violates terms of service clauses that forbid accessing the sites through automated means.
Those allegations are central to a pending lawsuit over scraping that Facebook filed against Power.com. The latter company aggregates information from a variety of social networking sites, enabling users with accounts at services like Orkut, MySpace, LinkedIn and Twitter to access their information from one portal. To do so, Power asks users to provide log-in information for their social networking sites and then imports their information.
Facebook objects to the practice, arguing that Power is violating a federal computer fraud law by scraping. A judge recently dismissed some of Facebook's claims, but ruled that Power could be liable if it circumvented technical barriers on Facebook. Facebook has publicly said that Power's technology could threaten members' privacy because Power enables users to easily transfer photos or messages marked "private" to other social networking services. Power counters that users have the ability to do this manually anyway and that Facebook's objection to the practice is driven by the desire to keep control over the data. (Facebook recently released a tool allowing users to download some -- but, critically, not all -- of the information associated with their profiles with a single click.)
As that lawsuit continues through the courts, questions continue to swirl over the legality of accessing data on social networking sites. Even without using automatic scrapers, companies can potentially violate a site's terms of service by collecting data about members. For example, Facebook's statement of rights and responsibilities includes this directive: "If you collect information from users, you will: obtain their consent, make it clear you (and not Facebook) are the one collecting their information, and post a privacy policy explaining what information you collect and how you will use it."
It's not clear whether courts would enforce this provision against, say, a company that's monitoring the site in order to gather intelligence about job applicants. As a practical matter, however, it's probably impossible to prevent people from manually collecting information that users have themselves made available in a public forum.
Great article!! Thank you. My company is about to start scraping data from social networks and providing it to employers in an aggregated format, but we're thinking of recording only "sentiment" and not republishing any content or photos. I believe we will still be protecting the rights of Facebook and other networks.
People are stupid. How many times, and from how many sources, do people have to be told not to post anything they don't want a particular person to see before they stop doing it? Stupid, stupid, stupid.