AI Researchers Create the Largest Database for Studying Career Identity

AI Researchers at Stony Brook University studied biographies of over 51 million Twitter users. Their results have transformed our understanding of what people think of themselves based on what they do for a living, and how they respond when the job market takes a hit.

Stony Brook, NY, Feb 15, 2024 - Researchers at Stony Brook University have argued that social media biographies are a great way to study self-identity. They analyzed self-authored biographies of over 51 million English-language Twitter users over a period of six years. Their paper, titled 'The Evolution of Occupational Identity in Twitter Biographies,' was accepted at AAAI 2024.

Ph.D. students Xingzhi Guo and Dakota Handzlik and Distinguished Teaching Professor Steven Skiena, all of the Department of Computer Science, collaborated with Jason J. Jones from Stony Brook's Sociology Department on this project.

The team found 435 million biography changes between Feb. 2015 and Jul. 2021, many of which reflected users revising their job titles. To these observations, the researchers added data regarding the prevalence and prestige of each job.
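To make the idea of a biography change that reflects a revised job title concrete, here is a minimal Python sketch. It is an illustration only, not the researchers' pipeline: the title list `JOB_TITLES`, the helpers `extract_title` and `count_transitions`, and the sample biographies are all hypothetical, and the simple substring matching is an assumption that a real system would need to replace with something more robust.

```python
from collections import Counter

# Hypothetical, tiny title vocabulary used only for this illustration.
JOB_TITLES = ("software engineer", "professor", "teacher", "nurse", "ceo")


def extract_title(bio):
    """Return the first known title found in a biography, or None.

    Uses naive substring matching, which is an assumption of this sketch.
    """
    text = bio.lower()
    for title in JOB_TITLES:
        if title in text:
            return title
    return None


def count_transitions(snapshots):
    """Count (old_title, new_title) pairs across consecutive biography versions.

    `snapshots` maps a user id to that user's biographies in chronological order.
    """
    transitions = Counter()
    for bios in snapshots.values():
        for old_bio, new_bio in zip(bios, bios[1:]):
            if old_bio == new_bio:
                continue  # identical text, so not a biography change
            old_title = extract_title(old_bio)
            new_title = extract_title(new_bio)
            if old_title and new_title and old_title != new_title:
                transitions[(old_title, new_title)] += 1
    return transitions


# Made-up example data, not drawn from the dataset.
example = {
    "user_a": ["Teacher. Dog lover.", "Teacher and dog lover.", "Nurse. Dog lover."],
    "user_b": ["Professor at a university", "Software engineer"],
}
transitions = count_transitions(example)
print(transitions)
```

In this sketch, a reworded biography that keeps the same title is a biography change but not a job transition, which is why only some of the observed changes would count as title revisions.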

The dataset they constructed is now the largest resource for studying the dynamics of occupational identity.

Skiena said, "Work is an essential part of our daily lives. It significantly contributes to our sense of identity and how we behave around others." For example, a business owner will have a different personality than a school teacher, and a nurse will usually be more patient than a race-car driver.

Because we devote most of our lives to building a career, our occupation becomes a crucial part of who we are. But it is difficult to observe and understand how an individual’s occupational identity evolves over time, which factors affect it, in what ways, and why. This is why the team focused on social media biographies. 

One of their findings is that the more prestigious society considers a job, the more likely people are to include it in their occupational identity. This is why the data lists more CEOs and Owners than Restaurant Bussers or Clerks, even though we’d expect more workers than executives to show up in the listings.

Another interesting point addressed mobility in occupational identity. Usually, people limit themselves to the same job category, partially because occupational identity directs us to behave as a member of a particular group. For example, a Professor in Computer Science rarely goes on to become a Software Engineer even though she may be able to do so for greater material return.

Figure: Examples of job transitions (darker edges represent more observed transitions)

Looking at full-year data from 2020, the researchers also studied how people represented their occupations on Twitter during the pandemic, noting which jobs people were comfortable transitioning between and talking about on social media.

They noted a dip in 2020 in the proportion of users including job titles in their self-descriptions, which coincided with the spike in unemployment early in the pandemic.

Figure: Twitter biographies during the COVID-19 pandemic

They also saw that half of the top 10 entry jobs fell under the category of personal care and service occupations, including esthetician, licensed esthetician, doula, health advocate, and tattoo artist.

According to the U.S. Bureau of Labor Statistics, this group of jobs is projected to grow 14% from 2021 to 2031, nearly double the growth rate of the job market overall.

On the other hand, the top ten exit jobs largely corresponded to media positions, sports positions, and entertainment, reflecting a category that can be described as “aspirational jobs” — careers with low barriers to entry and title adoption but higher barriers for conversion to a long-term professional vocation.

The team of researchers at Stony Brook has offered a compelling amount of data, and a case study highlighting the impact of a pandemic on work identities. Their work showcases the dataset’s ability to capture intriguing real-world phenomena.

This is the most comprehensive study of social media biography dynamics to date.

Their work holds potential for numerous applications. It could help organizations use social approval mechanisms to their benefit (such as enhancing job prestige, increasing social recognition, and celebrating their employees effectively), and it could eliminate the need to scrape private resumes from social networks, which would mitigate ethical and privacy concerns.

It also enriches the field of economics and contributes to well-informed policy decisions regarding workforce dynamics.

The team plans to build on this work in several key areas: studying how self-identity persists and which parts of it matter more or less; looking at how “aspirational identities” will evolve; and observing the future of job dynamics, especially in the GenAI era.

 

Communications Assistant
Ankita Nagpal

This work was partially supported by NSF grants IIS-1926781, IIS-1927227, IIS-1546113 and OAC-1919752.