Twitter tweet scraping model (Fall 2022)
[Notice] Part of the "Journey to the Academic Researcher" series: the story of how I became a researcher.
This code scrapes tweet data from the Twitter server. Before running it, users must register for a developer account and obtain the various keys required for free access to the API, explaining their purpose for using the Twitter API as part of the application. If the stated goals are insufficiently explained, or are not an appropriate use of the API, Twitter denies the registration. Unauthorized users cannot access the full set of API functions and are strictly rate-limited. Even fully authorized users may issue at most 900 requests per 15 minutes, a cap that protects Twitter's servers from overload.
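That 900-request budget is what the req_counter / time.sleep logic in the loops below enforces by hand. The same idea as a small reusable helper, sketched here for clarity (the function name and defaults are mine, not part of the original code):

import time

def throttle(counter, limit=900, window=901):
    """Sleep out one rate-limit window after `limit` requests.

    Mirrors the 900-requests-per-15-minutes budget described above;
    returns the updated request counter.
    """
    counter += 1
    if counter >= limit:
        time.sleep(window)  # 15 minutes plus a second of slack
        counter = 0
    return counter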
import ssl
import time

import pandas as pd
import tweepy
from tqdm import tqdm

# Work around local SSL certificate errors by disabling verification.
ssl._create_default_https_context = ssl._create_unverified_context
# v1.1 credentials: api.get_status below uses these.
consumer_key = "ENTER YOUR KEY"
consumer_secret = "ENTER YOUR KEY"
access_key = "ENTER YOUR KEY"
access_secret = "ENTER YOUR KEY"

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)

# v2 client for full-archive search; wait_on_rate_limit makes tweepy sleep
# automatically whenever the search quota is exhausted.
client = tweepy.Client(bearer_token='ENTER YOUR KEY', wait_on_rate_limit=True)
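Before kicking off a multi-hour scrape, it is worth confirming that the v1.1 keys actually work. This check is not in the original notebook, but verify_credentials is a standard tweepy call:

# Optional sanity check; assumes the auth objects defined above.
try:
    me = api.verify_credentials()
    print("Authenticated as @" + me.screen_name)
except tweepy.TweepyException as err:
    raise SystemExit("Authentication failed: " + str(err))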
Search_all_tweets version
database = pd.read_excel('Twitter_Official_1009_update22.xlsx', sheet_name='base')
database
  | news_date | End | Start | Official Account | Add | Unnamed: 5 |
---|---|---|---|---|---|---|
0 | 2018-02-27 | 2021-06-22 | 2017-08-27 | cvspharmacy | 2019-05-03 | True |
1 | 2018-04-03 | 2020-10-21 | 2017-10-03 | dominos | 2019-05-07 | True |
2 | 2018-04-10 | 2021-06-17 | 2017-10-10 | Nordstrom | 2018-03-29 | True |
3 | 2018-02-01 | 2021-06-16 | 2017-08-01 | kroger | 2019-03-24 | True |
4 | 2018-02-23 | 2021-05-10 | 2017-08-23 | Kohls | 2018-02-13 | True |
5 | 2018-02-01 | 2021-06-09 | 2017-08-01 | Lowes | 2019-10-23 | True |
6 | 2019-10-14 | 2021-03-12 | 2019-04-14 | lululemon | 2019-09-28 | True |
7 | 2018-02-12 | 2021-06-30 | 2017-08-12 | Starbucks | 2019-09-02 | True |
8 | 2018-01-23 | 2021-06-21 | 2017-07-23 | ATT | 2019-06-25 | True |
9 | 2018-02-05 | 2021-06-08 | 2017-08-05 | TDBank_US | 2018-07-13 | True |
10 | 2018-01-24 | 2021-06-30 | 2017-07-24 | Target | 2017-11-09 | True |
11 | 2018-02-06 | 2021-03-24 | 2017-08-06 | Tmobile | 2018-09-09 | True |
12 | 2018-02-13 | 2020-10-21 | 2017-08-13 | ultabeauty | 2017-09-30 | True |
13 | 2018-03-15 | 2020-08-11 | 2017-09-15 | Walmart | 2018-12-19 | True |
column_names = ["Account","gen_date", "text", "in_url", "in_media", "hash_text", "hash_count", "ret_count", "fav_count"]
tw_data = pd.DataFrame(columns = column_names)
tw_data
Account | gen_date | text | in_url | in_media | hash_text | hash_count | ret_count | fav_count |
---|---|---|---|---|---|---|---|---|
0 rows × 9 columns
req_counter = 0
for idx, data in tqdm(database.iterrows()):
    # Full-archive search for original English tweets only: the -is:
    # operators exclude retweets, replies, quotes, and promoted-only tweets.
    tweet_ex = client.search_all_tweets(
        query="from:" + data['Official Account']
              + " lang:en -is:retweet -is:reply -is:quote -is:nullcast",
        start_time=data['Start'], end_time=data['Add'], max_results=500)
    acc_name = data['Official Account']
    if tweet_ex.data is None:  # no tweets in this account's window
        continue
    for tweet in tweet_ex.data:
        # Check whether the number of v1.1 requests has reached the limit.
        req_counter += 1
        if req_counter == 900:
            time.sleep(901)  # wait out the 15-minute window
            req_counter = 0
        target_id = tweet.id
        ex_stat = api.get_status(target_id)  # v1.1 lookup for the metadata
        gen_date = ex_stat._json['created_at']
        cont = ex_stat._json['text']
        try:
            urls = ex_stat.entities['urls'][0]
            in_url = urls['url']
        except (KeyError, IndexError):
            in_url = "No URL included"
        try:
            media = ex_stat.entities['media'][0]
            in_media = media['type']
        except (KeyError, IndexError):
            in_media = "No Media included"
        hash_cont = []
        for hashtag in ex_stat.entities['hashtags']:
            hash_cont.append(hashtag['text'])
        hash_num = len(ex_stat.entities['hashtags'])
        ret_num = ex_stat._json['retweet_count']
        fav_num = ex_stat._json['favorite_count']
        # Earlier attempt at counting replies per tweet, left commented out:
        # for i in client.search_all_tweets(query="in_reply_to_status_id: " + str(target_id),
        #                                   start_time=data['Start'], end_time=data['End'], max_results=200):
        #     rep_counter += 1
        # rep_num = rep_counter
        rows = [acc_name, gen_date, cont, in_url, in_media, hash_cont,
                hash_num, ret_num, fav_num]
        tw_data.loc[len(tw_data)] = rows
14it [04:59, 21.43s/it]
tw_data
  | Account | gen_date | text | in_url | in_media | hash_text | hash_count | ret_count | fav_count |
---|---|---|---|---|---|---|---|---|---|
0 | cvspharmacy | Tue Jun 25 13:30:32 +0000 2019 | Using @SpaRoomProducts' therapeutic 100% Pure ... | https://t.co/Qlq5WI3pFD | No Media included | [] | 0 | 0 | 6 |
1 | cvspharmacy | Fri Jun 21 19:38:29 +0000 2019 | ☀️ 🖍️ Kick off summer with our free coloring ... | https://t.co/vdzONMLBRp | photo | [] | 0 | 3 | 13 |
2 | cvspharmacy | Wed Jun 19 13:34:45 +0000 2019 | Up to 50% of Americans don’t take their medica... | https://t.co/2M2B3fK3yF | No Media included | [] | 0 | 4 | 15 |
3 | cvspharmacy | Tue Jun 18 15:30:52 +0000 2019 | Slide into the season without sneezing. We del... | https://t.co/CszOREMaZb | No Media included | [] | 0 | 3 | 7 |
4 | cvspharmacy | Sun Jun 16 13:30:01 +0000 2019 | Thanks for everything you do, dads! Happy Fath... | No URL included | photo | [] | 0 | 5 | 12 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
474 | Walmart | Fri Dec 21 18:49:02 +0000 2018 | Not only is @MartinaMcBride a country supersta... | https://t.co/Gh4XcaI0ow | No Media included | [] | 0 | 6 | 22 |
475 | Walmart | Fri Dec 21 17:52:32 +0000 2018 | If you know, you know. https://t.co/F94z95mCGT | https://t.co/F94z95mCGT | No Media included | [] | 0 | 12 | 52 |
476 | Walmart | Thu Dec 20 21:02:11 +0000 2018 | In case Santa doesn’t get your letter, just se... | https://t.co/aXOWOFVH72 | No Media included | [] | 0 | 11 | 33 |
477 | Walmart | Thu Dec 20 20:07:34 +0000 2018 | Note to self: check for Cheeto-fingers before ... | https://t.co/MvG0Sa9Y9m | No Media included | [WalmartTopSeller] | 1 | 5 | 18 |
478 | Walmart | Thu Dec 20 18:08:46 +0000 2018 | It’s not over ‘til it’s over. And our 20 Days ... | https://t.co/JarvS9CPuP | No Media included | [] | 0 | 3 | 22 |
479 rows × 9 columns
tw_data.to_csv("FinalCheck_cycle22.csv")
PAGINATION & Search all tweets
database = pd.read_excel('Twitter_Official_1012_update25.xlsx', sheet_name='base')
database
  | news_date | End | Start | Official Account | Add | Unnamed: 5 |
---|---|---|---|---|---|---|
0 | 2018-02-15 | 2018-08-15 | 2017-08-15 | AMTDGroup | 2018-01-25 | True |
1 | 2018-04-06 | 2021-04-07 | 2017-10-06 | BestBuy | 2017-11-15 | True |
2 | 2018-02-22 | 2021-01-01 | 2017-08-22 | Avis | 2020-05-07 | True |
3 | 2018-02-06 | 2021-06-07 | 2017-08-06 | ChipotleTweets | 2021-03-24 | True |
4 | 2018-04-20 | 2019-07-15 | 2017-10-20 | Designer_Brands | 2018-06-19 | True |
5 | 2018-08-29 | 2021-05-18 | 2018-02-28 | DollarGeneral | 2018-03-18 | True |
6 | 2018-03-01 | 2019-12-18 | 2017-09-01 | darden | 2017-09-12 | True |
7 | 2020-10-04 | 2021-06-22 | 2020-04-04 | darden | 2021-01-26 | True |
8 | 2018-05-27 | 2021-06-28 | 2017-11-27 | Ford | 2021-03-07 | True |
9 | 2018-09-05 | 2019-07-15 | 2018-03-05 | FastenalCompany | 2018-04-01 | True |
10 | 2019-01-08 | 2020-06-21 | 2018-07-08 | footlocker | 2018-08-13 | True |
11 | 2018-07-06 | 2019-12-13 | 2018-01-06 | Genesco_Inc | 2019-04-30 | True |
12 | 2018-01-25 | 2020-10-02 | 2017-07-25 | Hyatt | 2017-08-15 | True |
13 | 2018-10-18 | 2019-04-18 | 2018-04-18 | habitburger | 2019-04-18 | True |
14 | 2018-04-25 | 2019-07-15 | 2017-10-25 | HDSupply | 2017-11-01 | True |
15 | 2018-01-11 | 2021-02-27 | 2017-07-11 | HiltonHotels | 2021-02-27 | True |
16 | 2020-04-27 | 2021-02-03 | 2019-10-27 | HRBlock | 2019-11-13 | True |
17 | 2018-08-23 | 2021-06-01 | 2018-02-23 | HSBC | 2018-03-07 | True |
18 | 2018-07-17 | 2021-01-02 | 2018-01-17 | Labcorp | 2020-03-19 | True |
19 | 2019-11-12 | 2020-05-12 | 2019-05-12 | ElPolloLoco | 2019-07-06 | True |
20 | 2018-01-18 | 2021-06-10 | 2017-07-18 | lukoilengl | 2020-03-03 | True |
21 | 2018-02-05 | 2018-08-05 | 2017-08-05 | lululemon | 2017-08-13 | True |
22 | 2018-04-01 | 2021-04-05 | 2017-10-01 | Macys | 2020-08-31 | True |
23 | 2018-03-09 | 2021-06-04 | 2017-09-09 | Marriott | 2017-10-23 | True |
24 | 2018-01-17 | 2019-07-15 | 2017-07-17 | MurphyUSA | 2017-08-09 | True |
25 | 2018-01-17 | 2019-02-27 | 2017-08-31 | MurphyUSA | 2019-02-27 | True |
26 | 2020-08-10 | 2021-03-09 | 2020-02-10 | MurphyUSA | 2020-10-02 | True |
27 | 2018-03-15 | 2021-06-29 | 2017-09-15 | Nike | 2021-06-29 | True |
28 | 2019-04-18 | 2019-10-18 | 2018-10-18 | OlliesOutlet | 2018-12-19 | True |
29 | 2019-07-26 | 2020-01-26 | 2019-01-26 | BankOZK | 2019-02-18 | True |
30 | 2018-04-02 | 2021-05-19 | 2017-10-02 | PolarisInc | 2017-10-09 | True |
31 | 2018-04-10 | 2021-06-22 | 2017-10-10 | childrensplace | 2018-09-23 | True |
32 | 2018-11-30 | 2019-09-06 | 2018-05-30 | PlanetFitness | 2019-03-31 | True |
33 | 2018-02-01 | 2021-06-07 | 2017-08-01 | PNCBank | 2017-08-30 | True |
34 | 2018-01-26 | 2021-05-06 | 2017-07-26 | PVHCorp | 2018-05-18 | True |
35 | 2018-03-19 | 2018-10-20 | 2018-03-19 | SportsmansWH | 2018-10-20 | True |
36 | 2019-05-08 | 2020-12-10 | 2018-11-08 | SportsmansWH | 2019-10-29 | True |
37 | 2018-03-07 | 2020-05-04 | 2017-09-07 | TruistNews | 2020-01-29 | True |
38 | 2020-12-02 | 2021-06-02 | 2020-06-02 | DelTaco | 2020-06-24 | True |
39 | 2018-03-22 | 2019-07-15 | 2017-09-22 | TractorSupply | 2018-04-29 | True |
40 | 2018-03-15 | 2021-04-04 | 2017-09-15 | Wendys | 2020-02-19 | True |
41 | 2018-01-11 | 2021-02-14 | 2017-07-11 | WolverineWW | 2017-09-07 | True |
42 | 2019-05-08 | 2020-12-10 | 2018-11-08 | SportsmansWH | 2019-10-29 | True |
43 | 2018-01-25 | 2021-06-22 | 2017-07-25 | riteaid | 2018-03-21 | True |
column_names = ["Account","gen_date", "text", "in_url", "in_media", "hash_text", "hash_count", "ret_count", "fav_count"]
tw_data = pd.DataFrame(columns = column_names)
tw_data
Account | gen_date | text | in_url | in_media | hash_text | hash_count | ret_count | fav_count |
---|---|---|---|---|---|---|---|---|
0 rows × 9 columns
req_counter = 0
for idx, data in tqdm(database.iterrows()):
    acc_name = data['Official Account']
    # Paginator follows next_token between pages automatically; flatten()
    # yields up to 2,000 tweets per account across pages.
    for tweet in tweepy.Paginator(
            client.search_all_tweets,
            query="from:" + data['Official Account']
                  + " lang:en -is:retweet -is:reply -is:quote -is:nullcast",
            start_time=data['Start'], end_time=data['Add'],
            max_results=500).flatten(limit=2000):
        req_counter += 1
        if tweet is None:
            continue
        if req_counter == 900:
            time.sleep(901)  # wait out the v1.1 15-minute window
            req_counter = 0
        target_id = tweet.id
        ex_stat = api.get_status(target_id)
        gen_date = ex_stat._json['created_at']
        cont = ex_stat._json['text']
        try:
            urls = ex_stat.entities['urls'][0]
            in_url = urls['url']
        except (KeyError, IndexError):
            in_url = "No URL included"
        try:
            media = ex_stat.entities['media'][0]
            in_media = media['type']
        except (KeyError, IndexError):
            in_media = "No Media included"
        hash_cont = []
        for hashtag in ex_stat.entities['hashtags']:
            hash_cont.append(hashtag['text'])
        hash_num = len(ex_stat.entities['hashtags'])
        ret_num = ex_stat._json['retweet_count']
        fav_num = ex_stat._json['favorite_count']
        rows = [acc_name, gen_date, cont, in_url, in_media, hash_cont,
                hash_num, ret_num, fav_num]
        tw_data.loc[len(tw_data)] = rows
The "Rate limit exceeded. Sleeping for ..." messages below come from wait_on_rate_limit=True on the Client: tweepy pauses automatically whenever the full-archive search quota is hit, while the manual req_counter guards the separate v1.1 get_status limit.
1it [00:00, 5.46it/s]Rate limit exceeded. Sleeping for 900 seconds.
4it [36:07, 665.03s/it]Rate limit exceeded. Sleeping for 822 seconds.
6it [49:50, 530.64s/it]Rate limit exceeded. Sleeping for 900 seconds.
8it [1:04:51, 495.97s/it]Rate limit exceeded. Sleeping for 900 seconds.
Rate limit exceeded. Sleeping for 897 seconds.
9it [1:34:52, 803.19s/it]Rate limit exceeded. Sleeping for 899 seconds.
10it [1:49:51, 827.37s/it]Rate limit exceeded. Sleeping for 901 seconds.
11it [2:05:37, 858.57s/it]Rate limit exceeded. Sleeping for 856 seconds.
13it [2:19:53, 673.35s/it]Rate limit exceeded. Sleeping for 900 seconds.
14it [2:34:54, 726.88s/it]Rate limit exceeded. Sleeping for 900 seconds.
15it [2:49:54, 770.48s/it]Rate limit exceeded. Sleeping for 901 seconds.
16it [3:06:34, 830.94s/it]Rate limit exceeded. Sleeping for 802 seconds.
18it [3:19:56, 646.05s/it]Rate limit exceeded. Sleeping for 900 seconds.
19it [3:34:56, 705.78s/it]Rate limit exceeded. Sleeping for 901 seconds.
20it [3:49:57, 754.91s/it]Rate limit exceeded. Sleeping for 901 seconds.
24it [4:24:02, 524.36s/it]Rate limit exceeded. Sleeping for 705 seconds.
25it [4:35:47, 576.91s/it]Rate limit exceeded. Sleeping for 901 seconds.
26it [4:50:49, 672.14s/it]Rate limit exceeded. Sleeping for 900 seconds.
28it [5:06:14, 576.69s/it]Rate limit exceeded. Sleeping for 876 seconds.
30it [5:20:51, 522.05s/it]Rate limit exceeded. Sleeping for 900 seconds.
32it [5:35:51, 496.16s/it]Rate limit exceeded. Sleeping for 901 seconds.
33it [5:50:55, 579.58s/it]Rate limit exceeded. Sleeping for 898 seconds.
35it [6:05:53, 531.46s/it]Rate limit exceeded. Sleeping for 901 seconds.
36it [6:20:54, 608.61s/it]Rate limit exceeded. Sleeping for 901 seconds.
37it [6:35:56, 675.80s/it]Rate limit exceeded. Sleeping for 900 seconds.
38it [6:50:56, 731.23s/it]Rate limit exceeded. Sleeping for 901 seconds.
39it [7:05:57, 775.58s/it]Rate limit exceeded. Sleeping for 901 seconds.
40it [7:20:58, 809.71s/it]Rate limit exceeded. Sleeping for 901 seconds.
41it [7:52:41, 1115.17s/it]Rate limit exceeded. Sleeping for 850 seconds.
43it [8:06:52, 808.84s/it] Rate limit exceeded. Sleeping for 900 seconds.
44it [8:22:14, 684.88s/it]
tw_data
  | Account | gen_date | text | in_url | in_media | hash_text | hash_count | ret_count | fav_count |
---|---|---|---|---|---|---|---|---|---|
0 | BestBuy | Tue Nov 14 18:59:07 +0000 2017 | Gear up.\n\nGet the #StarWarsBattlefrontII Eli... | https://t.co/qaVjKvyqv6 | No Media included | [StarWarsBattlefrontII] | 1 | 30 | 61 |
1 | BestBuy | Tue Nov 14 15:00:01 +0000 2017 | .@saradietschy proves that no matter how big o... | No URL included | No Media included | [] | 0 | 21 | 129 |
2 | BestBuy | Mon Nov 13 18:00:12 +0000 2017 | They’ll be dashing like Dasher and dancing lik... | https://t.co/bpibw0wXFH | No Media included | [] | 0 | 19 | 39 |
3 | BestBuy | Mon Nov 13 15:00:09 +0000 2017 | Got a fav song on JAY-Z’s 4:44 album?\nTell us... | https://t.co/AuwgDRsUjB | No Media included | [BestBuyTicketsNY, Sweepstakes] | 2 | 27 | 93 |
4 | BestBuy | Sun Nov 12 15:00:09 +0000 2017 | The AMD Ryzen Processor with Radeon Vega Graph... | https://t.co/2gOQRxYv6h | No Media included | [] | 0 | 26 | 52 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2954 | riteaid | Mon Aug 07 01:01:52 +0000 2017 | The DreamShip takes flight at the @3RiversRega... | https://t.co/wuKidTA8vY | No Media included | [] | 0 | 5 | 11 |
2955 | riteaid | Sun Aug 06 12:00:02 +0000 2017 | Why wait? Get your flu shot now through August... | No URL included | photo | [] | 0 | 4 | 8 |
2956 | riteaid | Fri Aug 04 13:30:08 +0000 2017 | The more points you earn with wellness+Plenti,... | https://t.co/qQhIuoId6G | No Media included | [] | 0 | 5 | 5 |
2957 | riteaid | Wed Aug 02 20:54:03 +0000 2017 | It's never too early to get your flu shot, so ... | https://t.co/uvjriPn3iv | No Media included | [] | 0 | 5 | 9 |
2958 | riteaid | Tue Jul 25 12:00:03 +0000 2017 | Make healthy eating fun with a picnic lunch of... | No URL included | photo | [] | 0 | 2 | 4 |
2959 rows × 9 columns
tw_data.to_csv("FinalCheck_cycle25.csv")
PAGINATION SECTION for one account
column_names = ["Account","gen_date", "text", "in_url", "in_media", "hash_text", "hash_count", "ret_count", "fav_count"]
tw_data = pd.DataFrame(columns = column_names)
tw_data
start_time = "2017-07-25"+"T00:00:01Z"
end_time = "2018-03-21"+"T00:00:01Z"
acc_name = "riteaid"
req_counter = 0
for tweet in tqdm(tweepy.Paginator(client.search_all_tweets, query= "from: "+acc_name+ " lang:en -is:retweet -is:reply -is:quote -is:nullcast",
start_time=start_time, end_time=end_time, max_results=500).flatten(limit=2000)):
req_counter += 1
if req_counter == 900:
time.sleep(901)
req_counter = 0
target_id = tweet.id
ex_stat = api.get_status(target_id)
gen_date = ex_stat._json['created_at']
cont = ex_stat._json['text']
try:
urls = ex_stat.entities['urls'][0]
in_url = urls['url']
except:
in_url = "No URL included"
try:
media = ex_stat.entities['media'][0]
in_media = media['type']
except:
in_media = "No Media included"
hash_cont = []
for hashtag in ex_stat.entities['hashtags']:
hash_cont.append(hashtag['text'])
hash_num = len(ex_stat.entities['hashtags'])
ret_num = ex_stat._json['retweet_count']
fav_num = ex_stat._json['favorite_count']
# for i in tweepy.Paginator(client.search_all_tweets, query="in_reply_to_status_id: "+str(target_id),
# start_time=start_time, end_time=end_time, max_results=500).flatten(limit=100000):
# for i in client.search_all_tweets(query="in_reply_to_status_id: "+str(target_id), start_time=start_time, end_time=end_time, max_results=200):
# idx += 1
# rep_num = idx
rows = [acc_name, gen_date, cont, in_url, in_media, hash_cont, hash_num, ret_num, fav_num]
tw_data.loc[len(tw_data)] = rows
tw_data
tw_data.to_csv("FinalCheck_riteaid.csv")
column_names = ["Account","gen_date", "text", "in_url", "in_media", "hash_text", "hash_count", "ret_count", "fav_count"]
tw_data = pd.DataFrame(columns = column_names)
tw_data
start_time = "2018-11-08"+"T00:00:01Z"
end_time = "2019-10-29"+"T00:00:01Z"
acc_name = "SportsmansWH"
req_counter = 0
for tweet in tqdm(tweepy.Paginator(client.search_all_tweets, query= "from: "+acc_name+ " lang:en -is:retweet -is:reply -is:quote -is:nullcast",
start_time=start_time, end_time=end_time, max_results=500).flatten(limit=2000)):
req_counter += 1
if req_counter == 900:
time.sleep(901)
req_counter = 0
target_id = tweet.id
ex_stat = api.get_status(target_id)
gen_date = ex_stat._json['created_at']
cont = ex_stat._json['text']
try:
urls = ex_stat.entities['urls'][0]
in_url = urls['url']
except:
in_url = "No URL included"
try:
media = ex_stat.entities['media'][0]
in_media = media['type']
except:
in_media = "No Media included"
hash_cont = []
for hashtag in ex_stat.entities['hashtags']:
hash_cont.append(hashtag['text'])
hash_num = len(ex_stat.entities['hashtags'])
ret_num = ex_stat._json['retweet_count']
fav_num = ex_stat._json['favorite_count']
# for i in tweepy.Paginator(client.search_all_tweets, query="in_reply_to_status_id: "+str(target_id),
# start_time=start_time, end_time=end_time, max_results=500).flatten(limit=100000):
# for i in client.search_all_tweets(query="in_reply_to_status_id: "+str(target_id), start_time=start_time, end_time=end_time, max_results=200):
# idx += 1
# rep_num = idx
rows = [acc_name, gen_date, cont, in_url, in_media, hash_cont, hash_num, ret_num, fav_num]
tw_data.loc[len(tw_data)] = rows
tw_data
tw_data.to_csv("FinalCheck_sportsmansWH.csv")
column_names = ["Account","gen_date", "text", "in_url", "in_media", "hash_text", "hash_count", "ret_count", "fav_count"]
tw_data = pd.DataFrame(columns = column_names)
tw_data
Account | gen_date | text | in_url | in_media | hash_text | hash_count | ret_count | fav_count |
---|---|---|---|---|---|---|---|---|
0 rows × 9 columns
start_time = "2019-04-14"+"T00:00:01Z"
end_time = "2019-09-28"+"T00:00:01Z"
acc_name = "lululemon"
req_counter = 0
for tweet in tqdm(tweepy.Paginator(client.search_all_tweets, query= "from: "+acc_name+ " lang:en -is:retweet -is:reply -is:quote -is:nullcast",
start_time=start_time, end_time=end_time, max_results=500).flatten(limit=2000)):
req_counter += 1
if req_counter == 900:
time.sleep(901)
req_counter = 0
target_id = tweet.id
ex_stat = api.get_status(target_id)
gen_date = ex_stat._json['created_at']
cont = ex_stat._json['text']
try:
urls = ex_stat.entities['urls'][0]
in_url = urls['url']
except:
in_url = "No URL included"
try:
media = ex_stat.entities['media'][0]
in_media = media['type']
except:
in_media = "No Media included"
hash_cont = ex_stat.entities['hashtags']
hash_num = len(ex_stat.entities['hashtags'])
ret_num = ex_stat._json['retweet_count']
fav_num = ex_stat._json['favorite_count']
# for i in tweepy.Paginator(client.search_all_tweets, query="in_reply_to_status_id: "+str(target_id),
# start_time=start_time, end_time=end_time, max_results=500).flatten(limit=100000):
# for i in client.search_all_tweets(query="in_reply_to_status_id: "+str(target_id), start_time=start_time, end_time=end_time, max_results=200):
# idx += 1
# rep_num = idx
rows = [acc_name, gen_date, cont, in_url, in_media, hash_cont, hash_num, ret_num, fav_num]
tw_data.loc[len(tw_data)] = rows
51it [00:36, 1.40it/s]
tw_data
  | Account | gen_date | text | in_url | in_media | hash_text | hash_count | ret_count | fav_count |
---|---|---|---|---|---|---|---|---|---|
0 | lululemon | Wed Aug 28 02:30:40 +0000 2019 | Vetted by @lululemonmen , our best men's worko... | https://t.co/QZ8sqhYMXr | No Media included | [] | 0 | 46 | 47 |
1 | lululemon | Mon Aug 26 23:00:11 +0000 2019 | We’re going beyond the buzzwords and giving yo... | https://t.co/dloPhlz2EY | No Media included | [] | 0 | 5 | 38 |
2 | lululemon | Sat Aug 17 04:10:55 +0000 2019 | Has anyone seen @craig_mcmorris fanny pack? ht... | https://t.co/zeyycJs9ni | photo | [{'text': 'SeaWheeze', 'indices': [68, 78]}] | 1 | 2 | 23 |
3 | lululemon | Fri Aug 16 15:55:00 +0000 2019 | There are 10,000 people running #SeaWheeze. Bu... | https://t.co/saxcECmR2L | No Media included | [{'text': 'SeaWheeze', 'indices': [32, 42]}] | 1 | 1 | 25 |
4 | lululemon | Tue Aug 06 23:30:42 +0000 2019 | Her relationship with her boobs is complicated... | https://t.co/LO3UDF7F9v | No Media included | [] | 0 | 7 | 28 |
5 | lululemon | Tue Aug 06 19:07:31 +0000 2019 | Where there's boobs, there's truths...and duet... | https://t.co/xzFESvHa57 | No Media included | [] | 0 | 5 | 20 |
6 | lululemon | Wed Jul 31 00:08:23 +0000 2019 | Introducing Boob Truth Tuesdays—You’ll laugh, ... | https://t.co/dcCXqjTmrm | No Media included | [] | 0 | 6 | 31 |
7 | lululemon | Tue Jul 23 23:00:02 +0000 2019 | Better, together—our full collection of Men's ... | https://t.co/8JKFYfaeyS | No Media included | [] | 0 | 4 | 47 |
8 | lululemon | Thu Jul 18 00:00:20 +0000 2019 | Sign-up to be the first to know about the new ... | https://t.co/9unfTAajwb | No Media included | [] | 0 | 5 | 45 |
9 | lululemon | Sun Jul 14 00:10:01 +0000 2019 | Need healing? A confidence-boost? Rest? 3 reas... | https://t.co/lkggudeIFi | No Media included | [] | 0 | 10 | 40 |
10 | lululemon | Sat Jul 06 22:37:57 +0000 2019 | Professional quarterback @NickFoles’ secret to... | https://t.co/j6mPl1Wnqn | No Media included | [{'text': 'lululemon', 'indices': [93, 103]}] | 1 | 32 | 295 |
11 | lululemon | Fri Jul 05 19:02:33 +0000 2019 | More time for yoga–ICYMI, Elite Ambassador and... | https://t.co/LlaKEFYaeB | No Media included | [] | 0 | 3 | 39 |
12 | lululemon | Sun Jun 30 20:43:48 +0000 2019 | In honour of 50 years of #pride we’ve asked so... | https://t.co/sIUMuGrBYb | No Media included | [{'text': 'pride', 'indices': [25, 31]}] | 1 | 6 | 70 |
13 | lululemon | Sat Jun 29 22:59:01 +0000 2019 | In any profession, work stress is real. Here a... | https://t.co/Oyutp9qM0i | No Media included | [{'text': 'Chicago', 'indices': [56, 64]}] | 1 | 12 | 87 |
14 | lululemon | Sat Jun 22 02:00:00 +0000 2019 | “Yoga allows me to enjoy the present moment.” ... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 4 | 36 |
15 | lululemon | Sat Jun 22 01:00:12 +0000 2019 | Peace Coleman from I Grow Chicago shares what ... | https://t.co/nfLQG2spHP | No Media included | [] | 0 | 1 | 15 |
16 | lululemon | Sat Jun 22 01:00:00 +0000 2019 | “Yoga is an intimate date with myself.” - Elit... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 9 | 52 |
17 | lululemon | Sat Jun 22 00:30:00 +0000 2019 | “Yoga has taught me to make peace with the unk... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 5 | 46 |
18 | lululemon | Sat Jun 22 00:00:00 +0000 2019 | "Yoga is a daily dose of energy, strength, goo... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 2 | 32 |
19 | lululemon | Fri Jun 21 23:30:00 +0000 2019 | “Yoga puts me completely in control of my mood... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 0 | 25 |
20 | lululemon | Fri Jun 21 23:00:12 +0000 2019 | Adria Moses from @DETBoxingGym took her trauma... | https://t.co/jFcL3YEzEX | No Media included | [] | 0 | 0 | 14 |
21 | lululemon | Fri Jun 21 23:00:00 +0000 2019 | “Yoga helps me to be mindful.” - Elite Ambassa... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 0 | 8 |
22 | lululemon | Fri Jun 21 22:30:00 +0000 2019 | “Yoga turned me from an inflexible jock into a... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 1 | 18 |
23 | lululemon | Fri Jun 21 22:00:00 +0000 2019 | “Yoga helps create more space in my mind and b... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 3 | 23 |
24 | lululemon | Fri Jun 21 21:30:00 +0000 2019 | "Yoga clears my head and heals my body.” - Eli... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 2 | 28 |
25 | lululemon | Fri Jun 21 21:00:00 +0000 2019 | “Yoga is the tool I use to delete my back pain... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 1 | 15 |
26 | lululemon | Fri Jun 21 20:30:00 +0000 2019 | “Yoga exposed me completely.” - Elite Ambassad... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 0 | 19 |
27 | lululemon | Fri Jun 21 20:07:14 +0000 2019 | See what helped @AlexMazerolle discover her pa... | https://t.co/0Zo79KXgZO | No Media included | [] | 0 | 1 | 13 |
28 | lululemon | Fri Jun 21 20:00:00 +0000 2019 | “Yoga has extended my career.” - Elite Ambassa... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 3 | 15 |
29 | lululemon | Fri Jun 21 19:00:00 +0000 2019 | “Yoga has taught me to embrace the moment.” - ... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 2 | 15 |
30 | lululemon | Fri Jun 21 18:30:00 +0000 2019 | “Yoga makes me the most authentic and courageo... | https://t.co/QSSYlBvucy | No Media included | [] | 0 | 2 | 11 |
31 | lululemon | Fri Jun 21 18:00:00 +0000 2019 | “Yoga helped me win medals at the highest leve... | https://t.co/XbVZx7Tduj | No Media included | [] | 0 | 4 | 31 |
32 | lululemon | Fri Jun 21 17:30:00 +0000 2019 | “Yoga is my main method of recovery.” - Elite ... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 0 | 16 |
33 | lululemon | Fri Jun 21 17:00:01 +0000 2019 | “Yoga has taught me to breathe through my chal... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 6 | 18 |
34 | lululemon | Fri Jun 21 16:30:00 +0000 2019 | It's so much more than just poses. How has yog... | No URL included | No Media included | [{'text': 'internationalyogaday', 'indices': [... | 2 | 9 | 33 |
35 | lululemon | Fri Jun 21 16:00:28 +0000 2019 | It’s more powerful than you think: https://t.c... | https://t.co/HHsX6D1roo | photo | [{'text': 'internationalyogaday', 'indices': [... | 2 | 5 | 21 |
36 | lululemon | Tue Jun 18 23:00:12 +0000 2019 | Gahh, we’re so excited! Made with good ingredi... | https://t.co/rZhixCe3eg | No Media included | [] | 0 | 21 | 176 |
37 | lululemon | Tue Jun 18 16:52:12 +0000 2019 | Eau de burpees, sweaty hair, hot yoga face...i... | https://t.co/LoLIne0jN1 | No Media included | [] | 0 | 1 | 61 |
38 | lululemon | Wed Jun 05 21:13:00 +0000 2019 | 323,577 collective kilometers down, how may mo... | https://t.co/mDHcER2NFa | No Media included | [{'text': 'GlobalRunningDay', 'indices': [65, ... | 1 | 3 | 16 |
39 | lululemon | Thu May 30 23:00:06 +0000 2019 | Grab your run crew, hit the pavement and crush... | https://t.co/i7xaAwpvpB | No Media included | [] | 0 | 7 | 38 |
40 | lululemon | Fri May 24 16:01:01 +0000 2019 | On #GlobalRunningDay, let’s show the world the... | https://t.co/cuEbAnMZxP | No Media included | [{'text': 'GlobalRunningDay', 'indices': [3, 2... | 1 | 8 | 46 |
41 | lululemon | Tue May 21 22:18:01 +0000 2019 | Vancouver's own, @robbiedxc is our newest Glob... | https://t.co/xE6zPUg2Jh | No Media included | [] | 0 | 5 | 64 |
42 | lululemon | Sun May 12 12:51:05 +0000 2019 | Happy #MothersDay We’re celebrating our global... | https://t.co/4XxxgbNmCV | No Media included | [{'text': 'MothersDay', 'indices': [6, 17]}] | 1 | 1 | 31 |
43 | lululemon | Sat May 11 18:30:20 +0000 2019 | Recover from your run to keep moving and feeli... | https://t.co/GHNABwSQ3i | No Media included | [] | 0 | 2 | 35 |
44 | lululemon | Sat May 04 18:20:20 +0000 2019 | What sound runners eat and when? Find out in ... | https://t.co/4g4RkkhYSq | No Media included | [] | 0 | 3 | 35 |
45 | lululemon | Fri May 03 22:45:02 +0000 2019 | Congratulations Sun Choe, lululemon’s Chief Pr... | https://t.co/tyQY8tKQ86 | No Media included | [] | 0 | 11 | 73 |
46 | lululemon | Wed May 01 04:18:02 +0000 2019 | Reflecting on growing up, falling down, and fo... | https://t.co/4uixGLRtH1 | No Media included | [] | 0 | 11 | 46 |
47 | lululemon | Sat Apr 27 18:06:20 +0000 2019 | Learn how trail running can help build strengt... | https://t.co/KcjS0pbKm3 | No Media included | [] | 0 | 7 | 32 |
48 | lululemon | Thu Apr 18 00:32:00 +0000 2019 | Running is how he brings people together to ex... | https://t.co/jjsd4scINo | No Media included | [] | 0 | 3 | 23 |
49 | lululemon | Wed Apr 17 20:14:00 +0000 2019 | Our newest Global Run Ambassador opens up and ... | https://t.co/8nE3o8mv3V | No Media included | [] | 0 | 7 | 29 |
50 | lululemon | Sun Apr 14 20:18:19 +0000 2019 | Learn how the track can improve pace, efficien... | https://t.co/yY2xmADYQl | No Media included | [] | 0 | 3 | 27 |
tw_data.to_csv("FinalCheck_lululemon.csv")