385.714285714286
https://styles.redditmedia.com/t5_ba4ix/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N181MDI1MDM2_rare_51cbcb62-c7c3-42d3-bd35-4af683ebfea7-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0ad87871dcba1f8ca7ffbb4805874a3be2bf6385
paglaindian
paglaindian
503.201758267979
2589.45727539063
8966.212890625
3
2
40
0.025075
0.205433
0.002593
0
0.5
3
paglaindian
9/7/2016 9:02:47 AM
0
178
218
0
False
False
False
False
True
False
t2_118kaw
False
False
False
https://styles.redditmedia.com/t5_ba4ix/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N181MDI1MDM2_rare_51cbcb62-c7c3-42d3-bd35-4af683ebfea7-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0ad87871dcba1f8ca7ffbb4805874a3be2bf6385
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/paglaindian
3
3
3.125
1
1.04166666666667
0
0
47
48.9583333333333
96
Posted RepliedTo
RepliedTo Posted
webscraping Octoparse_ideas
Octoparse_ideas webscraping
app zoom url apps marketplace works vqdwybqsrg each names g6y4gtvmncq
app zoom url apps marketplace works vqdwybqsrg each names g6y4gtvmncq
zoom,apps marketplace,zoom each,app app,names url,each vqdwybqsrg,g6y4gtvmncq apps,vqdwybqsrg app,using pagination,really really,works
zoom,apps marketplace,zoom each,app app,names url,each vqdwybqsrg,g6y4gtvmncq apps,vqdwybqsrg app,using pagination,really really,works
100
https://styles.redditmedia.com/t5_2ww6fz/styles/profileIcon_p0fs6poxuue81.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=ad8a05656e188ec6a9261f421e194c98cda44f0d
shalashaska02
shalashaska02
1
2507.9248046875
9927.7822265625
1
1
0
0.017115
0.042652
0.002203
0
1
4
shalashaska02
7/27/2020 4:49:14 AM
0
1
40
0
False
False
False
False
True
False
t2_63161xy8
False
False
False
https://styles.redditmedia.com/t5_2ww6fz/styles/profileIcon_p0fs6poxuue81.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=ad8a05656e188ec6a9261f421e194c98cda44f0d
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/shalashaska02
3
0
0
0
0
0
0
33
64.7058823529412
51
Commented
Commented
webscraping
webscraping
api 30 url zoom filter apps pagenum app v1 marketplace
api 30 url zoom filter apps pagenum app v1 marketplace
api,v1 filter,pagenum pagenum,pagesize v1,apps apps,filter zoom,api pagesize,30 marketplace,zoom app,calling 30,marketplace
api,v1 filter,pagenum pagenum,pagesize v1,apps apps,filter zoom,api pagesize,30 marketplace,zoom app,calling 30,marketplace
1000
https://styles.redditmedia.com/t5_3rg12u/styles/profileIcon_snooa86f60bb-c42c-4b52-8254-78fc7cc8ce4d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=7fc35fb63d4295868cf1533a3608e4a15851e47b
octoparseideas
octoparseideas
5173.67811016018
2654.49462890625
7746.06640625
18
6
412
0.04313
0.741375
0.007556
0
0.222222222222222
5
Octoparseideas
1/22/2021 1:59:17 AM
0
370
21
0
False
False
False
False
True
False
t2_9xi78ijs
False
False
True
https://styles.redditmedia.com/t5_3rg12u/styles/profileIcon_snooa86f60bb-c42c-4b52-8254-78fc7cc8ce4d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=7fc35fb63d4295868cf1533a3608e4a15851e47b
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Octoparseideas
3
1180
3.35637284182382
275
0.782205535170805
0
0
18223
51.8332053360639
35157
Posted RepliedTo Commented
RepliedTo Posted Commented
u_Octoparseideas Octoparse_ideas webscraping SaaS content_marketing
Octoparse_ideas u_Octoparseideas webscraping content_marketing SaaS
data octoparse web scraping blog job utm use scrape more
web job data scraping proxy ip google # website scraper
web,scraping octoparse,blog utm,_source utm,_campaign utm,_medium data,collection reddit,octoparse _campaign,reddit utm_campaign,reddit lead,generation
web,scraping data,collection octoparse,blog niche,job lead,generation service,octoparse job,board job,boards re,service proxy,server
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
logical_bowl_5442
logical_bowl_5442
1
2664.796875
8369.0439453125
1
1
0
0.023961
0.153925
0.002133
0
1
6
Logical_Bowl_5442
9/29/2021 4:01:35 PM
0
1
4
0
False
False
False
False
True
False
t2_ey9wbmhx
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Logical_Bowl_5442
3
1
4.16666666666667
0
0
0
0
10
41.6666666666667
24
Commented
Commented
Octoparse_ideas
Octoparse_ideas
industry web hight see save development potential far country amazing
industry web hight see save development potential far country amazing
hight,potential development,see far,web country,industry see,hight article,save amazing,article save,far potential,country web,development
hight,potential development,see far,web country,industry see,hight article,save amazing,article save,far potential,country web,development
100
https://styles.redditmedia.com/t5_45c5da/styles/profileIcon_6k64rcqpc0r61.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d4b6a00861704ca3853e5748f4c246f083c37ff1
mountproxies
mountproxies
1
6293.470703125
1602.40380859375
0
1
0
0.004401
0
0.002208
0
0
7
MountProxies
3/23/2021 5:05:27 PM
0
1
24
0
False
False
False
False
True
False
t2_b3canq9z
False
False
True
https://styles.redditmedia.com/t5_45c5da/styles/profileIcon_6k64rcqpc0r61.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d4b6a00861704ca3853e5748f4c246f083c37ff1
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/MountProxies
37
1
25
0
0
0
0
1
25
4
RepliedTo
RepliedTo
webscraping
webscraping
good advice
good advice
good,advice
good,advice
128.571428571429
https://styles.redditmedia.com/t5_e1s9m/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV85MjY4MjI_rare_c51dd4c4-f870-487c-bb10-eb66106aea63-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a1bfd0bbc189fd49405b54d7566d721f3dc99e11
mental_diarrhea
mental_diarrhea
51.2201758267979
6532.9404296875
981.297607421875
2
2
4
0.007335
0
0.002705
0.166666666666667
0.333333333333333
8
mental_diarrhea
4/28/2014 6:23:13 AM
0
693
6987
0
False
False
False
False
True
False
t2_gc38o
False
False
False
https://styles.redditmedia.com/t5_e1s9m/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV85MjY4MjI_rare_c51dd4c4-f870-487c-bb10-eb66106aea63-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a1bfd0bbc189fd49405b54d7566d721f3dc99e11
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/mental_diarrhea
37
4
2.13903743315508
7
3.74331550802139
0
0
75
40.1069518716578
187
RepliedTo Commented
Commented RepliedTo
webscraping
webscraping
apis 20 python tv more selenium possible plug neighbors' simultaneous
apis 20 python tv selenium plug neighbors' simultaneous something large
simply,efficient hundreds,uniqueness hogs,selenium quantities,scrape simultaneous,latter bored,enough pretend,wife calls,seem 20,reqs more,wanting
simply,efficient hundreds,uniqueness hogs,selenium quantities,scrape simultaneous,latter bored,enough pretend,wife calls,seem 20,reqs more,wanting
100
https://styles.redditmedia.com/t5_wl2yz/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV80MDY5MDk_rare_6d71bbd2-1d83-408a-9844-93bd4ba23524-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2bd2dc04dcc2167a3ad62483d74e4ec3dc18e9c1
bubbajoe2000
bubbajoe2000
1
6832.99609375
819.943542480469
3
1
0
0.005501
0
0.002509
1
0
9
BubbaJoe2000
2/15/2019 12:16:54 AM
0
196
1300
0
False
False
False
False
True
False
t2_2ow42pz1
False
False
False
https://styles.redditmedia.com/t5_wl2yz/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV80MDY5MDk_rare_6d71bbd2-1d83-408a-9844-93bd4ba23524-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2bd2dc04dcc2167a3ad62483d74e4ec3dc18e9c1
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/BubbaJoe2000
37
2
1.8348623853211
0
0
0
0
49
44.954128440367
109
Posted
Posted
webscraping
webscraping
enter automate 5g home #checkavailability verizon need done suggestions projects
enter automate 5g home #checkavailability verizon need done suggestions projects
5g,home verizon,5g home,#checkavailability using,mechanical work,done haven't,work projects,past results,site #checkavailability,verizon go,verizon
5g,home verizon,5g home,#checkavailability using,mechanical work,done haven't,work projects,past results,site #checkavailability,verizon go,verizon
100
https://styles.redditmedia.com/t5_2kwqx4/styles/profileIcon_snoo857191c0-1ef4-4a45-bc93-1445fe28ea4b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5580868a233a69973c706e40cdb1f2b51fca8a59
meaveready
meaveready
1
6645.54443359375
71.2179489135742
1
2
0
0.005501
0
0.002334
0.5
0.5
10
Meaveready
4/16/2020 1:31:05 AM
0
576
2971
0
False
False
False
False
True
False
t2_4ba7h526
False
False
False
https://styles.redditmedia.com/t5_2kwqx4/styles/profileIcon_snoo857191c0-1ef4-4a45-bc93-1445fe28ea4b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5580868a233a69973c706e40cdb1f2b51fca8a59
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Meaveready
37
3
1.77514792899408
10
5.91715976331361
0
0
100
59.1715976331361
169
Commented RepliedTo
Commented RepliedTo
webscraping
webscraping
null false open fridge cat tx addressline2 window yourself verifye911address
null false open fridge cat tx addressline2 window yourself verifye911address
open,fridge verifye911address,false installtype,null doable,automate null,addressdescriptorlist eventcorrelationid,null null,addressfromaccount selenium,similar talk,api false,reservation
open,fridge verifye911address,false installtype,null doable,automate null,addressdescriptorlist eventcorrelationid,null null,addressfromaccount selenium,similar talk,api false,reservation
100
https://styles.redditmedia.com/t5_4mtivi/styles/profileIcon_snoo5f9a7a0a-df45-4cb7-893a-569b6c04998d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=520d206453f1e62ff621aec736e1e54b734e8d32
dblacklabel31
dblacklabel31
1
503.528503417969
5627.2861328125
1
1
0
0
0
0.002439
0
0
11
DBlackLabel31
6/19/2021 11:50:10 PM
0
10
1
0
False
False
False
False
True
False
t2_ctnl4rsk
False
False
True
https://styles.redditmedia.com/t5_4mtivi/styles/profileIcon_snoo5f9a7a0a-df45-4cb7-893a-569b6c04998d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=520d206453f1e62ff621aec736e1e54b734e8d32
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/DBlackLabel31
1
Posted
Posted
yournamewebsite
yournamewebsite
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
bobhogan
bobhogan
1
7452.560546875
1167.97436523438
1
1
0
0.004401
0
0.002192
0
1
12
BobHogan
8/16/2012 7:52:24 PM
0
29896
118100
0
False
False
False
False
True
False
t2_8ox92
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/BobHogan
28
2
7.69230769230769
2
7.69230769230769
0
0
8
30.7692307692308
26
Commented RepliedTo
Commented RepliedTo
learnpython
learnpython
throwing using try soup work scrape errors unexpected requests results
throwing using try soup work scrape errors unexpected requests results
beautiful,soup errors,giving soup,requests giving,unexpected requests,scrape unexpected,results try,using throwing,errors work,throwing using,beautiful
beautiful,soup errors,giving soup,requests giving,unexpected requests,scrape unexpected,results try,using throwing,errors work,throwing using,beautiful
142.857142857143
https://styles.redditmedia.com/t5_27tuh5/styles/profileIcon_snoo4b2079da-e99d-4e80-993c-cd94ddfc1090-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=11e19bbd8083205762b71a959f6cfd8009dd1e7c
juanchi_parra
juanchi_parra
76.3302637401968
7209.22607421875
1762.80407714844
4
3
6
0.007335
0
0.003179
0
0.666666666666667
13
juanchi_parra
11/4/2019 8:11:36 PM
0
204
28
0
False
False
False
False
True
False
t2_4oz8zgyn
False
False
False
https://styles.redditmedia.com/t5_27tuh5/styles/profileIcon_snoo4b2079da-e99d-4e80-993c-cd94ddfc1090-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=11e19bbd8083205762b71a959f6cfd8009dd1e7c
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/juanchi_parra
28
2
1.70940170940171
0
0
0
0
53
45.2991452991453
117
RepliedTo Posted
Posted RepliedTo
webscraping learnpython
webscraping learnpython
gaming msi extract arena battlefy need data advice google tools
gaming msi arena battlefy advice google tools methods hey chrome
gaming,arena msi,gaming battlefy,msi need,extract arena,battlefy advice,suggestion tools,usually data,battlefy scraping,google octoparse,web
gaming,arena msi,gaming battlefy,msi arena,battlefy advice,suggestion tools,usually data,battlefy scraping,google octoparse,web web,scraping
100
https://styles.redditmedia.com/t5_4kzfn2/styles/profileIcon_y5wla05xzi471.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=fb0d4ead20d4cd267b42367a2c66c1115044164a
uaskmebefore
uaskmebefore
1
819.451293945313
5627.2861328125
1
1
0
0
0
0.002439
0
0
14
uaskmebefore
6/10/2021 11:41:46 PM
0
3376
4
0
False
False
False
False
True
False
t2_cnfkhf58
False
False
True
https://styles.redditmedia.com/t5_4kzfn2/styles/profileIcon_y5wla05xzi471.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=fb0d4ead20d4cd267b42367a2c66c1115044164a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/uaskmebefore
1
0
0
0
0
0
0
39
90.6976744186046
43
Posted
Posted
TimedNews
TimedNews
amazon twitter html scraping facebook article 07 on-many-sites-including octoparse the-company-has-operated-since-march-25-2015-one-named
amazon twitter html scraping facebook article 07 on-many-sites-including octoparse the-company-has-operated-since-march-25-2015-one-named
timed-news,timednews scraping,user-account-profiles-and-other-information one-defendant-is-the-us-subsidiary-of-a-chinese-national-high-tech-enterprise,meta-said-in-its-complaint the-company-has-operated-since-march-25-2015-one-named,octopus on-july-5,filed-suits-against-two-data-scraping-websites 20636,html california-based-octopus-data,octopus and-amazon,amazon amazon,on-many-sites-including parent-company-meta,metaverse
timed-news,timednews scraping,user-account-profiles-and-other-information one-defendant-is-the-us-subsidiary-of-a-chinese-national-high-tech-enterprise,meta-said-in-its-complaint the-company-has-operated-since-march-25-2015-one-named,octopus on-july-5,filed-suits-against-two-data-scraping-websites 20636,html california-based-octopus-data,octopus and-amazon,amazon amazon,on-many-sites-including parent-company-meta,metaverse
1000
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
rectangulau
rectangulau
2863.55002212748
395.862762451172
1279.18981933594
5
2
228
0.026794
1E-06
0.003236
0
0.25
15
Rectangulau
2/14/2020 11:18:27 AM
0
23
32
0
False
False
False
False
True
False
t2_5olhjohq
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Rectangulau
2
7
6.36363636363636
0
0
0
0
49
44.5454545454545
110
Posted RepliedTo Commented
Posted Commented RepliedTo
webscraping learnprogramming
webscraping learnprogramming
tweets make sense 1620945297000 help standard octoparse's indicated perfect useful
tweets make 1620945297000 help standard octoparse's indicated perfect useful hi
make,sense tweets,exported template,sure sense,indicated anyone,insights standard,template using,octoparse's exported,excel hi,drawn insights,make
make,sense tweets,exported template,sure sense,indicated anyone,insights standard,template using,octoparse's exported,excel hi,drawn insights,make
100
https://styles.redditmedia.com/t5_18gg2a/styles/profileIcon_snoo41970676-5cb7-4bf5-9d48-2ee3a4ee0e9f-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3aa0e49c3d099eb8d7884bba8c35a9ab360e5765
firwolf
firwolf
1
39.312629699707
1115.64416503906
1
1
0
0.021146
0
0.00217
0
1
16
firwolf
1/28/2017 4:55:52 PM
0
56
1069
0
False
False
False
False
True
False
t2_14uwml
False
False
False
https://styles.redditmedia.com/t5_18gg2a/styles/profileIcon_snoo41970676-5cb7-4bf5-9d48-2ee3a4ee0e9f-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3aa0e49c3d099eb8d7884bba8c35a9ab360e5765
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/firwolf
2
3
5.55555555555556
0
0
0
0
25
46.2962962962963
54
Commented
Commented
webscraping
webscraping
time milliseconds epoch convert readable stackoverflow unix stored dates uses
time milliseconds epoch convert readable stackoverflow unix stored dates uses
epoch,time uses,milliseconds convert,human unix,epoch human,readable milliseconds,unix convert,epoch dates,computer stored,number milliseconds,convert
epoch,time uses,milliseconds convert,human unix,epoch human,readable milliseconds,unix convert,epoch dates,computer stored,number milliseconds,convert
1000
https://styles.redditmedia.com/t5_c4jmi/styles/profileIcon_zn259frjj9w01.png?width=256&height=256&crop=256:256,smart&v=enabled&s=fc21412797ae17efad613d754a0bb6c71004fea3
mdaniel
mdaniel
3692.18292326965
3298.38916015625
5531.35498046875
3
3
294
0.018237
0
0.002705
0
0.5
17
mdaniel
7/2/2007 10:40:13 AM
0
427
5751
0
False
False
False
False
True
False
t2_22tq0
False
True
False
https://styles.redditmedia.com/t5_c4jmi/styles/profileIcon_zn259frjj9w01.png?width=256&height=256&crop=256:256,smart&v=enabled&s=fc21412797ae17efad613d754a0bb6c71004fea3
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/mdaniel
5
5
1.80505415162455
4
1.44404332129964
0
0
126
45.4873646209386
277
Commented RepliedTo
RepliedTo Commented
scrapinghub webscraping
webscraping scrapinghub
xhr html match chrome request xpath those one exercise network
xhr match chrome xpath one exercise network physically scraping line
chrome,developer physically,impossible xpath,match developer,tools tools,network match,xpath tend,more network,xhr filter,duckduckgo dealing,json
chrome,developer physically,impossible xpath,match developer,tools tools,network match,xpath tend,more network,xhr filter,duckduckgo dealing,json
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
maikelnait
maikelnait
1
3322.16186523438
6419.705078125
2
1
0
0.014958
0
0.002351
0
0
18
maikelnait
4/16/2007 10:18:04 PM
0
20
0
0
False
False
False
False
True
False
t2_1ig4l
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/maikelnait
5
1
1.06382978723404
1
1.06382978723404
0
0
46
48.936170212766
94
Posted
Posted
webscraping
webscraping
alicante idealista viviendas send desc ordenado venta alacant phone publicacion
alicante idealista viviendas send desc ordenado venta alacant phone publicacion
venta,viviendas publicacion,desc ordenado,fecha alicante,ordenado alacant,alicante idealista,venta viviendas,alicante fecha,publicacion alicante,alacant easy,way
venta,viviendas publicacion,desc ordenado,fecha alicante,ordenado alacant,alicante idealista,venta viviendas,alicante fecha,publicacion alicante,alacant easy,way
442.857142857143
https://styles.redditmedia.com/t5_35h2p0/styles/profileIcon_snooec5612ca-21f6-4115-9ba1-e2ad62f955ed-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=151a9d25113ea379a739fd62a64863b88b03e939
joachimbrnd
joachimbrnd
603.642109921575
4262.20166015625
3706.73193359375
3
3
48
0.011643
0
0.002425
0
1
19
joachimbrnd
9/21/2020 1:54:40 PM
0
1900
5
0
False
False
False
False
True
False
t2_85vmqrvh
False
False
False
https://styles.redditmedia.com/t5_35h2p0/styles/profileIcon_snooec5612ca-21f6-4115-9ba1-e2ad62f955ed-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=151a9d25113ea379a739fd62a64863b88b03e939
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/joachimbrnd
10
4
2.05128205128205
5
2.56410256410256
0
0
83
42.5641025641026
195
RepliedTo Posted
Posted RepliedTo
webscraping
webscraping
fiverr airtable thanks people using data sir scrape missed thank
airtable people sir scrape missed tried thanks using data thank
until,figure people,brand thanks,again people,yes euros,fiverr airtable,frame hello,everyone paid,guy scrape,thanks sure,sir
until,figure people,brand thanks,again people,yes euros,fiverr airtable,frame hello,everyone paid,guy scrape,thanks sure,sir
557.142857142857
https://styles.redditmedia.com/t5_24r4p5/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV81NjU4MTM_rare_ce0e8cbe-3fca-4026-b1e1-3191c3445aa9-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=86f9e7a891b3302f44b46a9cd37fc0a2eb2fc20c
sudodoyou
sudodoyou
804.522813228766
4087.8515625
3466.64501953125
2
3
64
0.012225
0
0.002808
0
0.25
20
sudodoyou
9/12/2019 10:27:51 PM
0
2475
4830
0
False
False
False
False
True
False
t2_4kqc2c7p
False
False
False
https://styles.redditmedia.com/t5_24r4p5/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV81NjU4MTM_rare_ce0e8cbe-3fca-4026-b1e1-3191c3445aa9-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=86f9e7a891b3302f44b46a9cd37fc0a2eb2fc20c
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/sudodoyou
10
6
2.29885057471264
2
0.766283524904215
0
0
127
48.6590038314176
261
Commented RepliedTo
RepliedTo Commented
webscraping
webscraping
airtable colab search scraping google indie banditelol blob url yellowpages
airtable colab search google indie banditelol blob url yellowpages part
notebook,airtable labtek,indie github,banditelol airtable,part ipynb#scrollto,oqkwxl5kwzh5 master,notebook scraping,airtable medium,labtek google,github indie,scraping
notebook,airtable labtek,indie github,banditelol airtable,part ipynb#scrollto,oqkwxl5kwzh5 master,notebook scraping,airtable medium,labtek google,github indie,scraping
428.571428571429
https://styles.redditmedia.com/t5_1uyemg/styles/profileIcon_snoo840ed39b-ce3f-4f31-9a60-8bb6be8020fc-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b2e7f0b3184ece63f885ca46f336fab7414902a6
bushcat69
bushcat69
578.532022008176
4436.77001953125
3819.544921875
2
3
46
0.010187
0
0.002506
0
0.666666666666667
21
bushcat69
5/2/2013 1:53:17 PM
0
1959
20737
0
False
False
False
False
True
False
t2_bjc70
False
False
True
https://styles.redditmedia.com/t5_1uyemg/styles/profileIcon_snoo840ed39b-ce3f-4f31-9a60-8bb6be8020fc-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b2e7f0b3184ece63f885ca46f336fab7414902a6
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/bushcat69
10
1
0.202839756592292
2
0.405679513184584
0
0
377
76.4705882352941
493
Commented
Commented
webscraping
webscraping
' 'x airtable x data 'sec start headers auth_json tracer
' 'x airtable x 'sec start headers auth_json tracer application
'ot,tracer ',' 'x,airtable 'no,cache' 'sec,fetch chrome,97 'sec,ch start,x 9','cache khtml,gecko
'ot,tracer ',' 'x,airtable 'no,cache' 'sec,fetch chrome,97 'sec,ch start,x 9','cache khtml,gecko
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
old-medicine2826
old-medicine2826
1
2473.12158203125
8502.671875
0
1
0
0.023961
0.153925
0.002133
0
0
22
Old-Medicine2826
9/18/2021 1:18:02 AM
0
1
-1
0
False
False
False
False
True
False
t2_elbizq4m
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Old-Medicine2826
3
2
13.3333333333333
0
0
0
0
8
53.3333333333333
15
Commented
Commented
Octoparse_ideas
Octoparse_ideas
degree email tool 1st getting good connections free including data
degree email tool 1st getting good connections free including data
getting,1st degree,connections tool,getting good,free free,tool data,including including,email 1st,degree connections,data
getting,1st degree,connections tool,getting good,free free,tool data,including including,email 1st,degree connections,data
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
mr_nice_
mr_nice_
1
4778.650390625
7249.9873046875
1
1
0
0.012158
1E-06
0.002209
0
1
23
Mr_Nice_
9/23/2013 2:44:21 AM
0
2190
25310
0
False
False
False
False
True
False
t2_d9547
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Mr_Nice_
6
4
5.88235294117647
2
2.94117647058824
0
0
24
35.2941176470588
68
RepliedTo Commented
Commented RepliedTo
webscraping
webscraping
well several selenium hell chrome scraping write octoparse detection plugins
well several selenium hell chrome scraping write octoparse detection plugins
detection,against selenium,write plugins,work chrome,plugins very,well free,chrome bot,detection write,script tried,several script,tell
detection,against selenium,write plugins,work chrome,plugins very,well free,chrome bot,detection write,script tried,several script,tell
557.142857142857
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
ibroughtashrubbery
ibroughtashrubbery
804.522813228766
4595.2685546875
7689.21240234375
3
3
64
0.014238
3E-06
0.002712
0
1
24
ibroughtashrubbery
1/26/2021 9:29:34 AM
0
701
1477
0
False
False
False
False
True
False
t2_9zzcbud3
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ibroughtashrubbery
6
8
5.79710144927536
5
3.6231884057971
0
0
51
36.9565217391304
138
RepliedTo Posted
Posted RepliedTo
webscraping
webscraping
free need allows thanks octoparse run happy user paid loading
octoparse happy free need allows thanks run user paid loading
allows,free hand,loading noob,sure point,click tool,long clear,error need,check error,noob services,query free,scrap
allows,free hand,loading noob,sure point,click tool,long clear,error need,check error,noob services,query free,scrap
985.714285714286
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
awebscrapingguy
awebscrapingguy
1557.82545063073
4412.8154296875
8176.4970703125
2
2
124
0.016959
9E-06
0.002295
0
1
25
awebscrapingguy
12/14/2020 8:51:52 PM
0
1
84
0
False
False
False
False
True
False
t2_3tlhlyln
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/awebscrapingguy
6
4
2.16216216216216
12
6.48648648648649
0
0
75
40.5405405405405
185
Commented RepliedTo
RepliedTo Commented
webscraping
webscraping
issue probably help provide explain minimum error request fake questions
probably minimum request fake questions code know issue help provides
expect,such message,wrong questions,answers minimum,reproducible anyone,companies contact,support massively,spammed explain,issue opinion,probably context,explain
expect,such message,wrong questions,answers minimum,reproducible anyone,companies contact,support massively,spammed explain,issue opinion,probably context,explain
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
medkhalilbenahmed
medkhalilbenahmed
1
2083.142578125
6288.90087890625
1
1
0
0
0
0.002439
0
0
26
medkhalilbenahmed
2/24/2022 10:02:52 PM
0
1
0
0
False
False
False
False
True
False
t2_5zaoh3r9
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/medkhalilbenahmed
1
0
0
0
0
0
0
1
100
1
Posted
Posted
u_medkhalilbenahmed
u_medkhalilbenahmed
removed
removed
100
https://styles.redditmedia.com/t5_s8wpz/styles/profileIcon_snoo2e5b3bef-6d61-41c7-9a29-1177762fa582-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=32a0a0b5e780bba82647d9c3bc86030bda766efa
fix_riven
fix_riven
1
4523.5107421875
4136.45849609375
2
1
0
0.007409
0
0.002377
0
0
27
Fix_Riven
12/3/2018 10:50:48 PM
0
2958
18063
0
False
False
False
False
True
False
t2_2pu2lj9n
False
False
False
https://styles.redditmedia.com/t5_s8wpz/styles/profileIcon_snoo2e5b3bef-6d61-41c7-9a29-1177762fa582-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=32a0a0b5e780bba82647d9c3bc86030bda766efa
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Fix_Riven
10
0
0
2
2.8169014084507
0
0
28
39.4366197183099
71
Posted
Posted
webscraping
webscraping
ish octoparse extract match regular thing info live stats over
ish octoparse extract match regular thing info live stats over
live,ish octoparse,know problem,idea wanted,live extract,info check,10 stats,updates know,set counter,kills tried,octoparse
live,ish octoparse,know problem,idea wanted,live extract,info check,10 stats,updates know,set counter,kills tried,octoparse
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
altruistic_olives
altruistic_olives
1
8159.57568359375
856.395812988281
0
1
0
0.002445
0
0.002269
0
0
28
Altruistic_Olives
8/9/2021 11:59:01 PM
0
6424
1074
0
False
False
False
False
True
False
t2_dtmvurq1
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Altruistic_Olives
72
0
0
0
0
0
0
3
60
5
Commented
Commented
webscraping
webscraping
still re looking
still re looking
still,looking re,still
still,looking re,still
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
national_tart6235
national_tart6235
1
8159.57568359375
332.943908691406
2
1
0
0.002445
0
0.002609
0
0
29
National_Tart6235
10/13/2021 6:00:06 PM
0
1
0
0
False
False
False
False
True
False
t2_ffxph4h8
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/National_Tart6235
72
6
3.84615384615385
4
2.56410256410256
0
0
65
41.6666666666667
156
Posted
Posted
webscraping
webscraping
data yelp hesitant python quite academic projects anyone something project
data yelp hesitant python quite academic projects anyone something project
learning,python those,academic project,hesitant hello,friends seems,yelp used,octoparse inexperienced,webscraping work,thanks invite,those scrapers,data
learning,python those,academic project,hesitant hello,friends seems,yelp used,octoparse inexperienced,webscraping work,thanks invite,those scrapers,data
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
pythonhelperbot
pythonhelperbot
1
5623.51123046875
5719.27783203125
0
1
0
0.015774
0
0.002178
0
0
30
pythonHelperBot
6/16/2017 4:10:50 AM
0
1
4502
0
False
False
False
False
True
False
t2_47mvs2w
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/pythonHelperBot
12
8
4.3010752688172
4
2.1505376344086
0
0
86
46.2365591397849
186
Commented
Commented
Python
Python
bot python learnpython those discord help faq github better crakenotsnowman
bot python learnpython those discord help faq github better crakenotsnowman
crakenotsnowman,redditpythonhelper github,crakenotsnowman others,readme faster,show code,tried stuck,sure reddit,learnpython faq,md question,python hard,tell
crakenotsnowman,redditpythonhelper github,crakenotsnowman others,readme faster,show code,tried stuck,sure reddit,learnpython faq,md question,python hard,tell
1000
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
octoparse
octoparse
2838.43993421408
5362.0791015625
5875.5390625
3
2
226
0.018718
0
0.002795
0
0
31
Octoparse
4/19/2019 5:38:16 AM
0
1
0
0
False
False
False
False
True
False
t2_3mizogba
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Octoparse
12
12
2.28136882129278
1
0.190114068441065
0
0
287
54.5627376425856
526
Posted Commented
Commented Posted
Python learnpython datagangsta
Python learnpython datagangsta
png octoparse using tweets scraping format words data #x200b web
png tweets words #x200b width content opinion redd preview curation
web,scraping format,png png,width png,auto auto,webp content,curation webp,enabled preview,redd using,octoparse opinion,words
format,png png,width png,auto auto,webp content,curation webp,enabled preview,redd opinion,words web,scraping scraping,content
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
perpjattithero
perpjattithero
1
187.605651855469
5627.2861328125
1
1
0
0
0
0.002439
0
0
32
perpjattithero
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/perpjattithero
1
0
0
0
0
0
0
1
100
1
Posted
Posted
a:t5_2zg0i
a:t5_2zg0i
removed
removed
185.714285714286
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
sanbenito444
sanbenito444
151.660527480394
6649.201171875
9197.7978515625
5
3
12
0.00978
0
0.003484
0
0.5
33
Sanbenito444
9/17/2017 4:13:45 PM
0
62
32
0
False
False
False
False
True
False
t2_e4r33q1
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Sanbenito444
24
27
5.69620253164557
4
0.843881856540084
0
0
208
43.8818565400844
474
RepliedTo Posted
Posted RepliedTo
SuggestALaptop
SuggestALaptop
need use gaming computer good programs windows chrome os buy
need use gaming computer good programs windows chrome os buy
chrome,os programs,games design,online pick,include use,surf leave,finishing list,programs marketing,design include,apply factor,good
chrome,os programs,games design,online pick,include use,surf leave,finishing list,programs marketing,design include,apply factor,good
100
https://styles.redditmedia.com/t5_dp4wu/styles/profileIcon_snoo0b850da4-4b0a-49ff-9ca2-ccb593aa9b09-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a8e0aa4b4839a8d18850c62f461eaea02e3f9f07
legos45
legos45
1
6661.43603515625
9927.7822265625
1
1
0
0.005589
0
0.002178
0
1
34
legos45
2/2/2016 8:17:24 PM
0
60763
26247
0
False
False
False
False
True
False
t2_ucznh
False
False
True
https://styles.redditmedia.com/t5_dp4wu/styles/profileIcon_snoo0b850da4-4b0a-49ff-9ca2-ccb593aa9b09-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a8e0aa4b4839a8d18850c62f461eaea02e3f9f07
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/legos45
24
2
1.6
0
0
0
0
72
57.6
125
Commented
Commented
SuggestALaptop
SuggestALaptop
ideapad walmart 330s 15 device gb ryzen core processor windows
ideapad walmart 330s 15 device gb ryzen core processor windows
ideapad,330s 81fb00hkus,2f273186587 money,contribute goto,walmart hi,think basic,specifications veh,aff think,lenovo core,processor 1883484,565706
ideapad,330s 81fb00hkus,2f273186587 money,contribute goto,walmart hi,think basic,specifications veh,aff think,lenovo core,processor 1883484,565706
100
https://styles.redditmedia.com/t5_3lb95/styles/profileIcon_snooc7a457de-7c9e-4571-900f-4dbcd0e08948-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=30ff2c63ee7a8ee7b508f1900999a2efc4e03572
elvinelmo
elvinelmo
1
6636.96728515625
8467.814453125
0
1
0
0.005589
0
0.002178
0
0
35
elvinelmo
5/5/2017 12:38:38 AM
0
4166
4355
0
False
False
False
False
True
False
t2_bnf5xp
False
False
True
https://styles.redditmedia.com/t5_3lb95/styles/profileIcon_snooc7a457de-7c9e-4571-900f-4dbcd0e08948-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=30ff2c63ee7a8ee7b508f1900999a2efc4e03572
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/elvinelmo
24
4
3.6036036036036
1
0.900900900900901
0
0
69
62.1621621621622
111
Commented
Commented
SuggestALaptop
SuggestALaptop
asus vivobook 15 ssd laptop ram en_us as_li_ss_tl soon 3200u
asus vivobook 15 ssd laptop ram en_us as_li_ss_tl soon 3200u
asus,vivobook vivobook,15 need,worry ssd,anytime 16gb,ddr4 decent,specs 16,gb more,enough tag,laptop04f1 265,sr
asus,vivobook vivobook,15 need,worry ssd,anytime 16gb,ddr4 decent,specs 16,gb more,enough tag,laptop04f1 265,sr
100
https://styles.redditmedia.com/t5_7ilwp/styles/profileIcon_snoo30b3e69e-100d-4a15-bfdf-76f6fe7f2655-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f024a94cd193bbad8b3f2d5a632402e02581905f
lonerim2
lonerim2
1
6293.470703125
9222.904296875
0
1
0
0.005589
0
0.002178
0
0
36
LonerIM2
5/4/2016 8:28:33 PM
0
4014
13641
0
False
False
False
False
True
False
t2_xox32
False
False
True
https://styles.redditmedia.com/t5_7ilwp/styles/profileIcon_snoo30b3e69e-100d-4a15-bfdf-76f6fe7f2655-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f024a94cd193bbad8b3f2d5a632402e02581905f
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/LonerIM2
24
9
4.52261306532663
0
0
0
0
104
52.2613065326633
199
Commented
Commented
SuggestALaptop
SuggestALaptop
lid comes laptop somewhat range aspire ram usb life battery
lid comes laptop somewhat range aspire ram usb life battery
acer,aspire battery,life ports,including bkadamos_alltest,20 check,laptop aspire,amazon numeric,keypad somewhat,standards premium,look under,500
acer,aspire battery,life ports,including bkadamos_alltest,20 check,laptop aspire,amazon numeric,keypad somewhat,standards premium,look under,500
100
https://styles.redditmedia.com/t5_27swe0/styles/profileIcon_vcua087l1kr41.png?width=256&height=256&crop=256:256,smart&v=enabled&s=7b32edb21288873bc57ac1a0e21c37bd10d2cd75
alex2440933
alex2440933
1
7004.93212890625
9172.69140625
1
1
0
0.005589
0
0.002178
0
1
37
alex2440933
11/4/2019 10:28:45 AM
0
1
2
0
False
False
False
False
True
False
t2_4ungwgb6
False
False
False
https://styles.redditmedia.com/t5_27swe0/styles/profileIcon_vcua087l1kr41.png?width=256&height=256&crop=256:256,smart&v=enabled&s=7b32edb21288873bc57ac1a0e21c37bd10d2cd75
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/alex2440933
24
0
0
0
0
0
0
12
46.1538461538462
26
Commented
Commented
SuggestALaptop
SuggestALaptop
scraper shopify downloadable using hey octoparse saas app projects ecommerce
scraper shopify downloadable using hey octoparse saas app projects ecommerce
scraper,shopify app,using shopify,saas saas,ecommerce using,scraper shopify,scraper octoparse,downloadable downloadable,app ecommerce,projects hey,octoparse
scraper,shopify app,using shopify,saas saas,ecommerce using,scraper shopify,scraper octoparse,downloadable downloadable,app ecommerce,projects hey,octoparse
114.285714285714
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
weedergate
weedergate
26.1100879133989
8727.48828125
6300.25341796875
3
3
2
0.00489
0
0.002882
0
1
38
WeederGate
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/WeederGate
47
8
4.76190476190476
2
1.19047619047619
0
0
73
43.4523809523809
168
RepliedTo Posted
Posted RepliedTo
datamining
datamining
end front software scraping script code looking click printing back
scraping code end front printing back start thanks great acquired
front,end click,front being,acquired expanding,gone clear,looking hopefully,question ok,cool start,surprised hello,everyone more,programs
being,acquired expanding,gone clear,looking hopefully,question ok,cool start,surprised hello,everyone more,programs visual,scraping appreciate,insight
100
https://styles.redditmedia.com/t5_20w2t9/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8xMTQyNDcx_rare_9d8c5a91-cce1-449a-bd4b-8952394f20de-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=8e3afd6a2e923cb84f06965517d029b2be718ee5
simius
simius
1
8490.1083984375
5668.94873046875
1
1
0
0.00326
0
0.002217
0
1
39
Simius
8/20/2010 5:05:03 PM
0
678
3403
0
False
False
False
False
True
False
t2_49nkx
False
False
True
https://styles.redditmedia.com/t5_20w2t9/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8xMTQyNDcx_rare_9d8c5a91-cce1-449a-bd4b-8952394f20de-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=8e3afd6a2e923cb84f06965517d029b2be718ee5
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Simius
47
3
10.7142857142857
0
0
0
0
11
39.2857142857143
28
Commented
Commented
datamining
datamining
helped select convertthat great palantir elements tool scheduled scrape before
helped select convertthat great palantir elements tool scheduled scrape before
helped,convertthat select,elements scheduled,scrape helped,select tool,before palantir,basically elements,helped great,tool acquired,palantir kimono,great
helped,convertthat select,elements scheduled,scrape helped,select tool,before palantir,basically elements,helped great,tool acquired,palantir kimono,great
100
https://styles.redditmedia.com/t5_85uws/styles/profileIcon_snoo16d465a5-0294-4a7e-908f-b53d08a74ebd-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3764651befd8cbb3c8ee6af6853b7ba821d917b5
surlynacho
surlynacho
1
8964.416015625
6950.87158203125
1
1
0
0.00326
0
0.002217
0
1
40
SurlyNacho
6/23/2017 11:17:38 AM
0
558
737
0
False
False
False
False
True
False
t2_4wikxc6
False
False
False
https://styles.redditmedia.com/t5_85uws/styles/profileIcon_snoo16d465a5-0294-4a7e-908f-b53d08a74ebd-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3764651befd8cbb3c8ee6af6853b7ba821d917b5
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/SurlyNacho
47
0
0
0
0
0
0
6
60
10
Commented
Commented
datamining
datamining
software one alternatives net octoparse alternativeto
software one alternatives net octoparse alternativeto
alternatives,alternativeto alternativeto,net software,octoparse net,software one,alternatives
alternatives,alternativeto alternativeto,net software,octoparse net,software one,alternatives
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
mrpenguin9
mrpenguin9
1
4380.92431640625
7178.76904296875
0
1
0
0.012863
0
0.002144
0
0
41
MrPenguin9
5/17/2018 7:20:26 PM
0
9014
5637
0
False
False
False
False
True
False
t2_4f7qh01
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/MrPenguin9
7
0
0
1
12.5
0
0
4
50
8
Commented
Commented
softwaregore
softwaregore
going crisis having see existential
going crisis having see existential
crisis,going existential,crisis having,existential going,see
crisis,going existential,crisis having,existential going,see
814.285714285714
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
twiggy3
twiggy3
1256.50439566995
4275.34765625
6363.3876953125
10
1
100
0.022757
0
0.004701
0
0
42
Twiggy3
11/2/2010 1:25:09 PM
0
6145
19693
0
False
False
False
False
True
False
t2_4hgfk
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Twiggy3
7
Posted
Posted
softwaregore
softwaregore
100
https://styles.redditmedia.com/t5_8fh0t/styles/profileIcon_snooa9ebb8ba-24b4-4f61-bff5-62b4f8ba67bc-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d2a95e6c01669467f0300b048e774311c82c2945
ssupii
ssupii
1
4778.650390625
6502.00146484375
0
1
0
0.012863
0
0.002144
0
0
43
SSUPII
8/21/2017 11:08:13 AM
0
10412
37609
0
False
False
False
False
True
False
t2_b9gkzub
False
False
True
https://styles.redditmedia.com/t5_8fh0t/styles/profileIcon_snooa9ebb8ba-24b4-4f61-bff5-62b4f8ba67bc-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d2a95e6c01669467f0300b048e774311c82c2945
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/SSUPII
7
1
20
0
0
0
0
2
40
5
Commented
Commented
softwaregore
softwaregore
seems meme surreal
seems meme surreal
surreal,meme seems,surreal
surreal,meme seems,surreal
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
hughjanus0
hughjanus0
1
4127.90625
4721.75
0
1
0
0.010202
0
0.002206
0
0
44
hughjanus0
9/27/2017 10:13:54 PM
0
172910
155120
0
False
False
False
False
True
False
t2_fap7s69
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/hughjanus0
7
0
0
0
0
0
0
1
50
2
RepliedTo
RepliedTo
softwaregore
softwaregore
technicallythetruth
technicallythetruth
300
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
fdf2002
fdf2002
352.541230787585
4319.0810546875
5486.43310546875
1
2
28
0.015571
0
0.002653
0
0
45
fdf2002
8/11/2016 12:25:43 AM
0
9030
18158
0
False
False
False
False
True
False
t2_10c39t
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/fdf2002
7
0
0
0
0
0
0
2
66.6666666666667
3
RepliedTo Commented
Commented RepliedTo
softwaregore
softwaregore
literally engrish
literally engrish
114.285714285714
https://styles.redditmedia.com/t5_ampzj/styles/profileIcon_mz89zchto6b11.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=0f489f443b82871b4e85129f486ad17d3985a247
nonamerequiredxd
nonamerequiredxd
26.1100879133989
4762.076171875
5103.451171875
1
1
2
0.010957
0
0.00238
0
0
46
NoNameRequiredxD
1/10/2018 8:23:31 PM
0
3427
67306
0
False
False
False
False
True
False
t2_ri4l3mj
False
False
True
https://styles.redditmedia.com/t5_ampzj/styles/profileIcon_mz89zchto6b11.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=0f489f443b82871b4e85129f486ad17d3985a247
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/NoNameRequiredxD
7
0
0
0
0
0
0
1
50
2
RepliedTo
RepliedTo
softwaregore
softwaregore
suicidebywords
suicidebywords
157.142857142857
https://styles.redditmedia.com/t5_my7xl/styles/profileIcon_zq8ta4q7aq021.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0c6aae6ea475eae907824efaab1a5568bddc01eb
dinojl
dinojl
101.440351653596
4673.61767578125
5877.83203125
1
1
8
0.014088
0
0.002322
0
0
47
dinojl
8/9/2018 1:10:36 AM
0
221
15387
0
False
False
False
False
True
False
t2_1y4imqhk
False
False
False
https://styles.redditmedia.com/t5_my7xl/styles/profileIcon_zq8ta4q7aq021.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0c6aae6ea475eae907824efaab1a5568bddc01eb
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/dinojl
7
0
0
0
0
0
0
0
0
1
Commented
Commented
softwaregore
softwaregore
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
matrixyst
matrixyst
1
3914.95458984375
5914.82177734375
0
1
0
0.012863
0
0.002144
0
0
48
matrixyst
7/13/2016 8:57:20 AM
0
21649
13135
0
False
False
False
False
True
False
t2_zgvut
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/matrixyst
7
0
0
2
10
0
0
8
40
20
Commented
Commented
softwaregore
softwaregore
way longer spent problem ve exist misread trying exit figure
way longer spent problem ve exist misread trying exit figure
trying,figure ve,trying spent,way exit,spent way,longer longer,ve misread,exist exist,exit figure,problem
trying,figure ve,trying spent,way exit,spent way,longer longer,ve misread,exist exist,exit figure,problem
100
https://styles.redditmedia.com/t5_7j4oh/styles/profileIcon_snoo386c16f7-9a01-4fd9-9660-d367636a6ff4-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=27b818eb3c6a7bed5f97e40611831a9569658263
missile500
missile500
1
4655.4765625
6952.3271484375
0
1
0
0.012863
0
0.002144
0
0
49
missile500
8/12/2017 9:17:39 PM
0
48121
31383
0
False
False
False
False
True
False
t2_9y33b05
False
False
True
https://styles.redditmedia.com/t5_7j4oh/styles/profileIcon_snoo386c16f7-9a01-4fd9-9660-d367636a6ff4-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=27b818eb3c6a7bed5f97e40611831a9569658263
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/missile500
7
0
0
0
0
0
0
7
87.5
8
Commented
Commented
softwaregore
softwaregore
breached redacted scp again task containment 079
breached redacted scp again task containment 079
breached,containment task,redacted scp,079 redacted,scp containment,again 079,breached
breached,containment task,redacted scp,079 redacted,scp containment,again 079,breached
100
https://styles.redditmedia.com/t5_1drj9x/styles/profileIcon_ay46h16gk3m61.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=04e3c31891cd6fb7acc412e69c75f9bf19409bab
techgineer13
techgineer13
1
3842.12768554688
6796.5927734375
0
1
0
0.012863
0
0.002144
0
0
50
techgineer13
5/16/2016 11:26:19 PM
0
28127
28526
0
False
False
False
False
True
False
t2_xzhho
False
False
True
https://styles.redditmedia.com/t5_1drj9x/styles/profileIcon_ay46h16gk3m61.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=04e3c31891cd6fb7acc412e69c75f9bf19409bab
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/techgineer13
7
0
0
0
0
0
0
2
66.6666666666667
3
Commented
Commented
softwaregore
softwaregore
ing intensifies
ing intensifies
ing,intensifies
ing,intensifies
100
https://styles.redditmedia.com/t5_f9du9/styles/profileIcon_snoo9996c46a-6e56-4b79-9c92-220d713df1ed-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3cfb54834ffeb343913f30117c16ac285a27bd22
mrpopzicle-supercard
mrpopzicle-supercard
1
4074.51489257813
7130.88330078125
0
1
0
0.012863
0
0.002144
0
0
51
MrPopzicle-Supercard
1/31/2018 9:31:14 PM
0
14770
11379
0
False
False
False
False
True
False
t2_bbpq9lt
False
False
True
https://styles.redditmedia.com/t5_f9du9/styles/profileIcon_snoo9996c46a-6e56-4b79-9c92-220d713df1ed-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3cfb54834ffeb343913f30117c16ac285a27bd22
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/MrPopzicle-Supercard
7
0
0
0
0
0
0
2
40
5
Commented
Commented
softwaregore
softwaregore
continue screen
continue screen
continue,screen
continue,screen
100
https://styles.redditmedia.com/t5_ifhrz/styles/profileIcon_a9p63w6ozzw51.png?width=256&height=256&crop=256:256,smart&v=enabled&s=be987ca8d449bfd3075c979df5fcf9a2c8e641d5
kotauskas
kotauskas
1
3735.17333984375
6302.4677734375
0
1
0
0.012863
0
0.002144
0
0
52
Kotauskas
4/24/2018 3:43:34 PM
0
65203
67609
0
False
False
False
False
True
False
t2_z8vr74e
False
False
False
https://styles.redditmedia.com/t5_ifhrz/styles/profileIcon_a9p63w6ozzw51.png?width=256&height=256&crop=256:256,smart&v=enabled&s=be987ca8d449bfd3075c979df5fcf9a2c8e641d5
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Kotauskas
7
0
0
0
0
0
0
3
50
6
Commented
Commented
softwaregore
softwaregore
button true reset
button true reset
reset,button true,reset
reset,button true,reset
100
https://styles.redditmedia.com/t5_6s9gsz/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8xMzAwMjMz_rare_cd6fa97f-dc35-4e12-beb8-799e3011efcf-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c01b6436fa6bf70b331741f9b4ee15bed626e158
grouchy_document7786
grouchy_document7786
1
7311.75048828125
8052.96923828125
0
1
0
0.004401
0
0.002192
0
0
53
Grouchy_Document7786
7/29/2022 8:54:32 AM
0
244
802
0
False
False
False
False
True
False
t2_qj15embb
False
False
False
https://styles.redditmedia.com/t5_6s9gsz/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8xMzAwMjMz_rare_cd6fa97f-dc35-4e12-beb8-799e3011efcf-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c01b6436fa6bf70b331741f9b4ee15bed626e158
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Grouchy_Document7786
36
1
2.85714285714286
1
2.85714285714286
0
0
15
42.8571428571429
35
Commented
Commented
data
data
kaggle versatile tons more reason look one use much learn
kaggle versatile tons more reason look one use much learn
learn,python python,one competitions,use much,more versatile,look kaggle,kaggle suggest,learn more,versatile use,analyze kaggle,tons
learn,python python,one competitions,use much,more versatile,look kaggle,kaggle suggest,learn more,versatile use,analyze kaggle,tons
142.857142857143
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
ethanwii
ethanwii
76.3302637401968
7311.75048828125
7365.71630859375
4
1
6
0.007335
0
0.003179
0
0
54
Ethanwii
4/5/2019 11:45:27 PM
0
5
4
0
False
False
False
False
True
False
t2_22bdq3zl
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Ethanwii
36
2
2.98507462686567
1
1.49253731343284
0
0
28
41.7910447761194
67
Posted
Posted
data
data
data know service projects queries sql octoparse hello atm hates
data know service projects queries sql octoparse hello atm hates
position,showcase know,basic atm,know really,find internship,projects data,scraping sql,queries happy,figure figure,things projects,position
position,showcase know,basic atm,know really,find internship,projects data,scraping sql,queries happy,figure figure,things projects,position
100
https://styles.redditmedia.com/t5_c1gar/styles/profileIcon_snoo912e662b-3874-409a-9d9b-de77a76787b1-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=be3459676a4b8e415d2700ebaeb0848258beb4eb
renridescycles
renridescycles
1
7012.34326171875
8052.96923828125
0
1
0
0.004401
0
0.002192
0
0
55
RenRidesCycles
12/4/2012 4:14:44 PM
0
97
20013
0
False
False
False
False
True
False
t2_9sk5k
False
False
False
https://styles.redditmedia.com/t5_c1gar/styles/profileIcon_snoo912e662b-3874-409a-9d9b-de77a76787b1-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=be3459676a4b8e415d2700ebaeb0848258beb4eb
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/RenRidesCycles
36
3
2.22222222222222
1
0.740740740740741
0
0
61
45.1851851851852
135
Commented
Commented
data
data
interested data process etc more something question see task technical
interested data process etc more something question see task technical
datasets,write more,seeing task,something question,ask problem,worked manager,more recommendation,algorithm write,process spam,detector city,see
datasets,write more,seeing task,something question,ask problem,worked manager,more recommendation,algorithm write,process spam,detector city,see
100
https://styles.redditmedia.com/t5_7mj1aa/styles/profileIcon_snoo61e78ee6-21cd-4033-b154-0387cbf1a3f8-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2f707c91a3458c34acd7c1b15be45cca68d0a570
alfarabi-logic
alfarabi-logic
1
7012.34326171875
7365.71630859375
0
1
0
0.004401
0
0.002192
0
0
56
alfarabi-logic
12/20/2022 11:34:11 PM
0
2
27
0
False
False
False
False
True
False
t2_v3ydm12o
False
False
False
https://styles.redditmedia.com/t5_7mj1aa/styles/profileIcon_snoo61e78ee6-21cd-4033-b154-0387cbf1a3f8-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2f707c91a3458c34acd7c1b15be45cca68d0a570
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/alfarabi-logic
36
0
0
0
0
0
0
6
42.8571428571429
14
Commented
Commented
data
data
data engineer analyst scientist
data engineer analyst scientist
data,engineer scientist,data analyst,data data,analyst data,scientist
data,engineer scientist,data analyst,data data,analyst data,scientist
100
https://styles.redditmedia.com/t5_7jz66/styles/profileIcon_snood0c5c497-f881-4197-9cb9-7c1289284151-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=35bb142bf03790e79c584a611076d6ded03049b6
kingofgamesyami
kingofgamesyami
1
6293.470703125
3291.4580078125
1
1
0
0.004401
0
0.002192
0
1
57
KingofGamesYami
5/31/2017 6:05:53 PM
0
661
28616
0
False
False
False
False
True
False
t2_1kgt50y
False
False
False
https://styles.redditmedia.com/t5_7jz66/styles/profileIcon_snood0c5c497-f881-4197-9cb9-7c1289284151-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=35bb142bf03790e79c584a611076d6ded03049b6
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/KingofGamesYami
33
1
2.63157894736842
2
5.26315789473684
0
0
16
42.1052631578947
38
Commented RepliedTo
Commented RepliedTo
AskProgramming
AskProgramming
much application api simpler webscraping public make tedious question prone
much application api simpler webscraping public make tedious question prone
much,much question,public interface,examples webscraping,tedious application,programming otherwise,lot api,make programming,interface public,api error,prone
much,much question,public interface,examples webscraping,tedious application,programming otherwise,lot api,make programming,interface public,api error,prone
142.857142857143
https://styles.redditmedia.com/t5_24eezy/styles/profileIcon_kwxxyhurmrk51.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=adff798bf578ea7293e7b38fc1a4e19841507adf
theofficialjewses
theofficialjewses
76.3302637401968
6565.23486328125
3786.81665039063
4
3
6
0.007335
0
0.003179
0
0.666666666666667
58
theofficialjewses
9/6/2019 8:17:26 PM
0
5579
332
0
False
False
False
False
True
False
t2_4j8g8po1
False
False
False
https://styles.redditmedia.com/t5_24eezy/styles/profileIcon_kwxxyhurmrk51.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=adff798bf578ea7293e7b38fc1a4e19841507adf
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/theofficialjewses
33
7
0.938337801608579
6
0.804289544235925
0
0
305
40.8847184986595
746
RepliedTo Posted
Posted RepliedTo
webdev AskProgramming
AskProgramming webdev
cars mmr auction vin autocheck find put notes go need
cars mmr auction vin autocheck find put notes go need
vin,number put,mmr mmr,calculator condition,report number,mileage automatically,put pull,data mmr,value calculator,autocheck data,auction
vin,number put,mmr mmr,calculator condition,report number,mileage automatically,put pull,data mmr,value calculator,autocheck data,auction
100
https://styles.redditmedia.com/t5_7r692/styles/profileIcon_snood151518e-5ac4-4e1e-b662-e7b74dc8af79-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=9a17e6f1363d4c3b2b48cfb24b8c8fdc2770d245
turbospeedsc
turbospeedsc
1
2843.65307617188
2682.53637695313
0
1
0
0.011927
0
0.002155
0
0
59
turbospeedsc
7/25/2017 2:55:07 AM
0
625
25524
0
False
False
False
False
True
False
t2_34f0wl
False
False
False
https://styles.redditmedia.com/t5_7r692/styles/profileIcon_snood151518e-5ac4-4e1e-b662-e7b74dc8af79-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=9a17e6f1363d4c3b2b48cfb24b8c8fdc2770d245
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/turbospeedsc
4
1
9.09090909090909
0
0
0
0
6
54.5454545454545
11
Commented
Commented
webscraping
webscraping
using octo works over 500k records scrapped
using octo works over 500k records scrapped
records,using over,500k using,octo works,scrapped scrapped,over 500k,records
records,using over,500k using,octo works,scrapped scrapped,over 500k,records
1000
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
canigetahoyahhh
canigetahoyahhh
1783.81624185133
2677.10498046875
2326.576171875
6
1
142
0.015524
0
0.003288
0
0
60
canigetahoyahhh
1/27/2021 4:13:45 PM
0
10
-9
0
False
False
False
False
True
False
t2_a0qsdjk6
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/canigetahoyahhh
4
8
6.45161290322581
7
5.64516129032258
0
0
40
32.258064516129
124
Posted
Posted
webscraping
webscraping
refund dissatisfied policy trial scraped use anything didn bloody user
refund dissatisfied policy trial scraped use anything didn bloody user
refund,policy didn,use free,trial refund,long use,anything user,dissatisfied anything,scraped long,user forgot,cancel dissatisfied,didn
refund,policy didn,use free,trial refund,long use,anything user,dissatisfied anything,scraped long,user forgot,cancel dissatisfied,didn
100
https://styles.redditmedia.com/t5_2g47kd/styles/profileIcon_snooea11db97-7757-45e0-a4e8-5a9272dc374c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c640cb0ae4535188c3bf95a5175459d52d539185
calson3asab
calson3asab
1
2468.53466796875
2554.13793945313
0
1
0
0.011927
0
0.002155
0
0
61
calson3asab
2/23/2020 7:28:38 AM
0
189
372
0
False
False
False
False
True
False
t2_5rcj9b7g
False
False
False
https://styles.redditmedia.com/t5_2g47kd/styles/profileIcon_snooea11db97-7757-45e0-a4e8-5a9272dc374c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c640cb0ae4535188c3bf95a5175459d52d539185
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/calson3asab
4
0
0
0
0
0
0
6
42.8571428571429
14
Commented
Commented
webscraping
webscraping
version web policy old archive find
version web policy old archive find
find,old version,web policy,version web,archive old,policy
find,old version,web policy,version web,archive old,policy
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
oldnothing7241
oldnothing7241
1
2270.74829101563
1968.578125
0
1
0
0.011643
0
0.002204
0
0
62
OldNothing7241
7/21/2021 10:26:37 PM
0
1
0
0
False
False
False
False
True
False
t2_dfzna8er
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
False
True
Open Reddit Page for This Person
https://www.reddit.com/user/OldNothing7241
4
3
3.15789473684211
5
5.26315789473684
0
0
30
31.5789473684211
95
RepliedTo
RepliedTo
webscraping
webscraping
forget refund make shit service standard signing easily chargeback switch
forget refund make shit service standard signing easily chargeback switch
bull,shit bait,switch service,forget put,crap forget,bull higher,standard dissatisfied,signing refund,policy signing,hopefully shitty,software
bull,shit bait,switch service,forget put,crap forget,bull higher,standard dissatisfied,signing refund,policy signing,hopefully shitty,software
514.285714285714
https://styles.redditmedia.com/t5_8zhiy/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8xMDI2ODc4_rare_a998c3b5-7760-4f37-852d-9a100c455760-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=9554ba3feb18ad71acb75bfd7a5d9fbda1e78281
nabinator
nabinator
729.19254948857
2525.83374023438
1976.26940917969
2
2
58
0.015046
0
0.002609
0
0.333333333333333
63
Nabinator
5/15/2017 3:49:39 AM
0
245
5392
0
False
False
False
False
True
False
t2_15oldue
False
False
False
https://styles.redditmedia.com/t5_8zhiy/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8xMDI2ODc4_rare_a998c3b5-7760-4f37-852d-9a100c455760-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=9554ba3feb18ad71acb75bfd7a5d9fbda1e78281
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Nabinator
4
5
14.7058823529412
0
0
0
0
14
41.1764705882353
34
RepliedTo Commented
Commented RepliedTo
webscraping
webscraping
re cancel forgot ended trial part enough over being generous
cancel forgot ended trial part enough over being generous nonce
before,free cancel,before re,right re,nonce part,50 trial,ended generous,fair ended,don over,part enough,re
before,free cancel,before re,right re,nonce part,50 trial,ended generous,fair ended,don over,part enough,re
685.714285714286
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
robsm
robsm
1030.51360444936
2629.84594726563
1542.03991699219
2
3
82
0.016576
0
0.00247
0
0.666666666666667
64
RobSm
4/18/2018 3:12:52 PM
0
5
393
0
False
False
False
False
True
False
t2_17pe8it3
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/RobSm
4
4
11.7647058823529
2
5.88235294117647
0
0
13
38.2352941176471
34
Commented RepliedTo
RepliedTo Commented
webscraping
webscraping
trial refund 100 works over during use bottleneck fault transfer
trial refund 100 works over during use bottleneck fault transfer
50,fault programming,language during,trial data,transfer network,latency bottleneck,programming trial,over fault,100 language,network 100,refund
50,fault programming,language during,trial data,transfer network,latency bottleneck,programming trial,over fault,100 language,network 100,refund
371.428571428571
https://styles.redditmedia.com/t5_7hzh3s/styles/profileIcon_snoo27aa03f3-05a8-494a-919c-b6514dbe9118-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f8ca70458ab33cfe41a843e31a1ceb3ecbeeb07d
atwoodenterprise
atwoodenterprise
478.09167035458
2657.91137695313
2796.9970703125
1
2
38
0.012225
0
0.002494
0
0.5
65
AtwoodEnterprise
12/3/2022 12:08:11 AM
0
1065
62
0
False
False
False
False
True
False
t2_uq0thvv9
False
False
False
https://styles.redditmedia.com/t5_7hzh3s/styles/profileIcon_snoo27aa03f3-05a8-494a-919c-b6514dbe9118-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f8ca70458ab33cfe41a843e31a1ceb3ecbeeb07d
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/AtwoodEnterprise
4
5
4.5045045045045
0
0
0
0
42
37.8378378378378
111
RepliedTo Commented
Commented RepliedTo
webscraping
webscraping
html download different ctrl script press webpage back taught used
html download different ctrl script press webpage back taught used
download,html press,ctrl good,used webpages,download efficient,liked create,script liked,went design,script better,lol lot,different
download,html press,ctrl good,used webpages,download efficient,liked create,script liked,went design,script better,lol lot,different
100
https://styles.redditmedia.com/t5_2d49xv/styles/profileIcon_nnd6xh9v1vt41.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=71e4828a37e3a748ca90e7ba00f6c3c7af1f6551
abdkhaled
abdkhaled
1
2676.27685546875
3254.66015625
1
1
0
0.009879
0
0.00226
0
1
66
AbdKhaled
1/18/2020 12:14:20 AM
0
79
85
0
False
False
False
False
True
False
t2_5gwbc20n
False
False
False
https://styles.redditmedia.com/t5_2d49xv/styles/profileIcon_nnd6xh9v1vt41.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=71e4828a37e3a748ca90e7ba00f6c3c7af1f6551
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/AbdKhaled
4
1
5.55555555555556
0
0
0
0
6
33.3333333333333
18
RepliedTo
RepliedTo
webscraping
webscraping
same man enlighten clue thinking please start
same man enlighten clue thinking please start
clue,start same,clue thinking,same enlighten,thinking please,enlighten man,please
clue,start same,clue thinking,same enlighten,thinking please,enlighten man,please
885.714285714286
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
05_legend
05_legend
1382.05483523694
2845.93237304688
1996.19067382813
1
2
110
0.016862
0
0.002244
0
0.5
67
05_legend
8/1/2022 11:43:32 PM
0
4
2261
0
False
False
False
False
True
False
t2_qscti6o0
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/05_legend
4
0
0
0
0
0
0
4
23.5294117647059
17
Commented
Commented
webscraping
webscraping
very familiar octoparse trying
very familiar octoparse trying
familiar,octoparse trying,very very,familiar
familiar,octoparse trying,very very,familiar
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
manonmo
manonmo
1
1767.21984863281
5627.2861328125
1
1
0
0
0
0.002439
0
0
68
Manonmo
4/6/2022 7:12:37 AM
0
2
2
0
False
False
False
False
True
False
t2_lly26kyq
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/Manonmo
1
0
0
0
0
0
0
1
100
1
Posted
Posted
webscraping
webscraping
removed
removed
100
https://styles.redditmedia.com/t5_6m0qtq/styles/profileIcon_snoob5e23622-b13b-43d4-9ed6-734c2c7b39d0-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=1b8719c92213c1bd2f825d64a89a7847039669a0
venetianarsenale
venetianarsenale
1
8159.57568359375
1451.06567382813
0
1
0
0.002445
0
0.002439
0
0
69
VenetianArsenale
6/28/2022 5:13:13 PM
0
232
1255
0
False
False
False
False
True
False
t2_pbhzp94d
False
False
False
https://styles.redditmedia.com/t5_6m0qtq/styles/profileIcon_snoob5e23622-b13b-43d4-9ed6-734c2c7b39d0-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=1b8719c92213c1bd2f825d64a89a7847039669a0
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/VenetianArsenale
71
2
2.73972602739726
0
0
0
0
35
47.9452054794521
73
RepliedTo
RepliedTo
webscraping
webscraping
capture using sportsbot nft marketplace rollbit information captured 'card' response
capture using sportsbot nft marketplace rollbit information captured 'card' response
rollbit,nft sportsbot,marketplace nft,sportsbot website,rollbit list,data 'listed,price' called,octoparse octoparse,tool data,being information,each
rollbit,nft sportsbot,marketplace nft,sportsbot website,rollbit list,data 'listed,price' called,octoparse octoparse,tool data,being information,each
100
https://styles.redditmedia.com/t5_6zxcot/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N180OTU4NTgz_rare_bcbfcecf-403a-42b1-bc82-85137d0e504c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=05032083e6d87f731ae14d0916d2807970570527
hellonhac
hellonhac
1
8159.57568359375
1974.517578125
1
0
0
0.002445
0
0.002439
0
0
70
hellonhac
9/6/2022 5:40:27 PM
0
1
619
0
False
False
False
False
True
False
t2_pjz0zyxy
False
False
False
https://styles.redditmedia.com/t5_6zxcot/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N180OTU4NTgz_rare_bcbfcecf-403a-42b1-bc82-85137d0e504c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=05032083e6d87f731ae14d0916d2807970570527
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/hellonhac
71
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
joelcorey
joelcorey
1
8086.94677734375
7022.08984375
1
1
0
0.003667
0
0.00221
0
1
71
joelcorey
7/19/2016 2:44:02 PM
0
59
723
0
False
False
False
False
True
False
t2_zn2t6
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/joelcorey
35
7
4.69798657718121
2
1.34228187919463
0
0
56
37.5838926174497
149
RepliedTo Commented
Commented RepliedTo
webscraping
webscraping
time scrape project happy puppeteer demo see need process somehow
time scrape project see happy puppeteer demo need process somehow
happy,demo built,ways select,document fine,skype available,whatever trulia,need document,queryselectall current,scraper time,tomorrow google,fine
happy,demo built,ways select,document fine,skype available,whatever trulia,need document,queryselectall current,scraper time,tomorrow google,fine
128.571428571429
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
doshka
doshka
51.2201758267979
7888.15966796875
7479.88232421875
3
2
4
0.005501
0
0.002732
0
0.5
72
doshka
2/17/2012 6:28:41 PM
0
745
32225
0
False
False
False
False
True
False
t2_6y6t0
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/doshka
35
17
3.33988212180747
4
0.785854616895874
0
0
203
39.8821218074656
509
RepliedTo Posted
Posted RepliedTo
webscraping
webscraping
elements detail way property xpath thanks one gui long trulia
elements detail way property xpath one gui trulia probably scripting
research,install way,go appreciate,demo way,navigate scrapestorm,octoparse take,dm 10,pc work,free limits,looking contains,listings
research,install way,go appreciate,demo way,navigate scrapestorm,octoparse take,dm 10,pc work,free limits,looking contains,listings
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
linkifybot
linkifybot
1
7491.09765625
8396.595703125
0
1
0
0.003667
0
0.002264
0
0
73
LinkifyBot
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/LinkifyBot
35
0
0
0
0
0
0
32
59.2592592592593
54
RepliedTo
RepliedTo
webscraping
webscraping
delete reddit message 20the scrapingdog 20delete comments 20button compose hyperlinked
delete reddit message 20the scrapingdog 20delete comments 20button compose hyperlinked
np,reddit 20send,20button 20the,20false compose,2fu message,click reddit,message 2fu,2flinkifybot click,20the comment,hyperlinked delete,20fswrjn2
np,reddit 20send,20button 20the,20false compose,2fu message,click reddit,message 2fu,2flinkifybot click,20the comment,hyperlinked delete,20fswrjn2
128.571428571429
https://styles.redditmedia.com/t5_nowu6/styles/profileIcon_snooeb2359c7-e29e-4ac4-a592-82bfbd2fa99b-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e9655f7ede38af538e5c3726dbae5293c2fbcb87
yakult2450
yakult2450
51.2201758267979
7690.04052734375
7939.14892578125
1
1
4
0.005501
0
0.002549
0
0
74
yakult2450
8/25/2018 8:34:40 PM
0
1551
44
0
False
False
False
False
True
False
t2_2292y409
False
False
False
https://styles.redditmedia.com/t5_nowu6/styles/profileIcon_snooeb2359c7-e29e-4ac4-a592-82bfbd2fa99b-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e9655f7ede38af538e5c3726dbae5293c2fbcb87
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/yakult2450
35
2
7.14285714285714
0
0
0
0
12
42.8571428571429
28
Commented
Commented
webscraping
webscraping
1000 pack calls using backend api try focus generous scrapingdog
1000 pack calls using backend api try focus generous scrapingdog
api,calls generous,free scrapingdog,focus try,provides provides,generous 1000,api collection,backend using,scrapingdog free,pack focus,data
api,calls generous,free scrapingdog,focus try,provides provides,generous 1000,api collection,backend using,scrapingdog free,pack focus,data
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
dhondtdoit
dhondtdoit
1
8159.57568359375
3092.6396484375
2
2
0
0.002445
0
0.002609
0
1
75
dhondtdoit
5/25/2021 1:10:50 PM
0
1
0
0
False
False
False
False
True
False
t2_cbqr1t1d
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/dhondtdoit
70
10
4.18410041841004
0
0
0
0
124
51.8828451882845
239
RepliedTo Posted
Posted RepliedTo
webscraping
webscraping
based scraping gui github jakopako config scraper still better solutions
based scraping gui github jakopako config scraper still better solutions
gui,based github,jakopako those,gui terminal,based jakopako,croncert croncert,config based,solutions extraction,parsing based,scraping providing,'smart'
gui,based github,jakopako those,gui terminal,based jakopako,croncert croncert,config based,solutions extraction,parsing based,scraping providing,'smart'
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
pyxru
pyxru
1
8159.57568359375
2569.1875
1
1
0
0.002445
0
0.002269
0
1
76
PyxRu
8/7/2018 10:27:06 PM
0
192
16
0
False
False
False
False
True
False
t2_1xuhlqm9
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/PyxRu
70
2
5.88235294117647
0
0
0
0
17
50
34
Commented
Commented
webscraping
webscraping
pxyup fitter github probably thing similar really tickets work came
pxyup fitter github probably thing similar really tickets work came
github,pxyup pxyup,fitter flight,tickets tickets,github thing,really probably,work work,together build,thing came,similar really,cool
github,pxyup pxyup,fitter flight,tickets tickets,github thing,really probably,work work,together build,thing came,similar really,cool
1000
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
kartikoli
kartikoli
9999
675.701965332031
651.351257324219
8
6
796.333333
0.031548
2E-06
0.003796
0
0.714285714285714
77
kartikoli
7/30/2019 2:12:06 AM
0
20
17
0
False
False
False
False
True
False
t2_1lch2tc7
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/kartikoli
2
27
2.80957336108221
8
0.832466181061394
0
0
499
51.9250780437045
961
RepliedTo Posted Commented
Posted RepliedTo Commented
learnprogramming regex
regex learnprogramming
text data google octoparse scrape maps help using trying xpath
google maps data text phone co scrape rocketreach revenue duplicate
google,maps remove,duplicate rocketreach,co duplicate,text bing,maps octoparse,scrape phone,numbers make,work 2012,present double,text
google,maps rocketreach,co bing,maps phone,numbers 2012,present 10,yrs present,10 maps,search duplicate,text remove,duplicate
100
https://styles.redditmedia.com/t5_3jcdez/styles/profileIcon_snoof2cb09eb-b748-41be-bed8-4bebc8bd1065-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=614644423c2057de62fd857eed171ee7f99de8ae
barnefield
barnefield
1
451.796539306641
260.855712890625
1
1
0
0.024
1E-06
0.002144
0
1
78
BarneField
12/9/2020 8:39:49 PM
0
1
1138
0
False
False
False
False
True
False
t2_6ioshygd
False
False
False
https://styles.redditmedia.com/t5_3jcdez/styles/profileIcon_snoof2cb09eb-b748-41be-bed8-4bebc8bd1065-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=614644423c2057de62fd857eed171ee7f99de8ae
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/BarneField
2
1
1.21951219512195
0
0
0
0
41
50
82
Commented
Commented
regex
regex
replace regex string same 1st lock way functions whole repeated
replace regex string same 1st lock way functions whole repeated
app,using replace,whole well,very regex,way capture,group very,relevant include,app group,see string,regex expression,anchors
app,using replace,whole well,very regex,way capture,group very,relevant include,app group,see string,regex expression,anchors
100
https://styles.redditmedia.com/t5_3gmkcv/styles/profileIcon_snooaf6b354b-50d5-4485-8cfc-d2e4ac58e13a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ecb1264a7d76c69f6d35a3883355f6db2605d251
whereismybroom
whereismybroom
1
334.181579589844
449.780639648438
0
1
0
0.024
1E-06
0.002144
0
0
79
whereIsMyBroom
11/24/2020 2:01:34 PM
0
50
706
0
False
False
False
False
True
False
t2_55wqqt2w
False
False
False
https://styles.redditmedia.com/t5_3gmkcv/styles/profileIcon_snooaf6b354b-50d5-4485-8cfc-d2e4ac58e13a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ecb1264a7d76c69f6d35a3883355f6db2605d251
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/whereIsMyBroom
2
0
0
0
0
0
0
16
44.4444444444444
36
Commented
Commented
regex
regex
regex101 text matches try instance replaces rovnrb something lines duplicate
regex101 text matches try instance replaces rovnrb something lines duplicate
something,replace duplicate,text regex101,rovnrb first,instance given,text replaces,first matches,lines regex101,demo try,something text,regex101
something,replace duplicate,text regex101,rovnrb first,instance given,text replaces,first matches,lines regex101,demo try,something text,regex101
185.714285714286
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
raventbk
raventbk
151.660527480394
7390.3076171875
9197.7978515625
5
3
12
0.00978
0
0.003484
0
0.5
80
RavenTBK
2/10/2014 10:00:41 PM
0
32
124
0
False
False
False
False
True
False
t2_f7ps0
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/RavenTBK
23
9
2.83018867924528
8
2.51572327044025
0
0
131
41.1949685534591
318
RepliedTo Posted
Posted RepliedTo
programmingrequests
programmingrequests
manually back youtu odometer need script work time octoparse deep
manually youtu odometer need script work time scripts bcw6vyndr0c pretty
youtu,bcw6vyndr0c well,here's steps,first beef,work deep,webpage goes,creation time,read info,fine automate,getting day,everything
youtu,bcw6vyndr0c well,here's steps,first beef,work deep,webpage goes,creation time,read info,fine automate,getting day,everything
100
https://styles.redditmedia.com/t5_vyk4k/styles/profileIcon_9npebtajvje21.png?width=256&height=256&crop=256:256,smart&v=enabled&s=36d1d24a8b0742f0f32686dda75ca579c6814870
ascor8522
ascor8522
1
7377.1552734375
9927.7822265625
1
1
0
0.005589
0
0.002178
0
1
81
Ascor8522
2/4/2019 12:56:53 PM
0
611
517
0
False
False
False
False
True
False
t2_35fm5alv
False
False
True
https://styles.redditmedia.com/t5_vyk4k/styles/profileIcon_9npebtajvje21.png?width=256&height=256&crop=256:256,smart&v=enabled&s=36d1d24a8b0742f0f32686dda75ca579c6814870
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Ascor8522
23
2
6.45161290322581
1
3.2258064516129
0
0
10
32.258064516129
31
Commented
Commented
programmingrequests
programmingrequests
dm pretty complicated try really work much very give done
dm pretty complicated try really work much very give done
try,dm very,complicated dm,work language,really give,try work,details seem,very pretty,much done,pretty complicated,done
try,dm very,complicated dm,work language,really give,try work,details seem,very pretty,much done,pretty complicated,done
100
https://styles.redditmedia.com/t5_vjutk/styles/profileIcon_snoo2711811d-789f-4b1a-889b-1a748bd9242b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=309cc6202c59306e6c48908e08c83951927438d6
tnilk
tnilk
1
7403.4599609375
8467.814453125
0
1
0
0.005589
0
0.002178
0
0
82
tnilk
1/28/2019 9:13:41 PM
0
260
3895
0
False
False
False
False
True
False
t2_33pyq1el
False
False
True
https://styles.redditmedia.com/t5_vjutk/styles/profileIcon_snoo2711811d-789f-4b1a-889b-1a748bd9242b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=309cc6202c59306e6c48908e08c83951927438d6
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/tnilk
23
2
4.54545454545455
0
0
0
0
19
43.1818181818182
44
RepliedTo
RepliedTo
programmingrequests
programmingrequests
free still help found people send dm haven't developing time
free still help found people send dm haven't developing time
automation,scraping free,send time,see scraping,tool free,time still,haven't feel,free send,dm developing,automation help,developing
automation,scraping free,send time,see scraping,tool free,time still,haven't feel,free send,dm developing,automation help,developing
100
https://styles.redditmedia.com/t5_bohen/styles/profileIcon_snoo1f6b1d8f-88ad-4951-998f-d69dc6b0663c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=de2ec40c06c65d9d5e7e3efa5db882ad783d2463
gsxhidden
gsxhidden
1
7746.03857421875
9224.787109375
1
1
0
0.005589
0
0.002178
0
1
83
GSxHidden
2/1/2013 8:28:29 PM
0
800
4123
0
False
False
False
False
True
False
t2_afxnk
False
False
False
https://styles.redditmedia.com/t5_bohen/styles/profileIcon_snoo1f6b1d8f-88ad-4951-998f-d69dc6b0663c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=de2ec40c06c65d9d5e7e3efa5db882ad783d2463
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/GSxHidden
23
2
5.12820512820513
0
0
0
0
17
43.5897435897436
39
Commented
Commented
programmingrequests
programmingrequests
power double reporting yourself pretty see download automate check little
power double reporting yourself pretty see download automate check little
little,programming tab,double double,check desktop,microsoft yourself,little download,direct data,download programming,see check,data power,automate
little,programming tab,double double,check desktop,microsoft yourself,little download,direct data,download programming,see check,data power,automate
100
https://styles.redditmedia.com/t5_52b9rr/styles/profileIcon_snooed353f69-4680-4105-9a5e-3255e69d8610-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2c50b60e4588ee83d95ba750c21982295348c97e
trycatchlife
trycatchlife
1
7034.5771484375
9170.8076171875
0
1
0
0.005589
0
0.002178
0
0
84
TryCatchLife
9/19/2021 10:34:57 PM
0
810
324
0
False
False
False
False
True
False
t2_en35yvum
False
False
False
https://styles.redditmedia.com/t5_52b9rr/styles/profileIcon_snooed353f69-4680-4105-9a5e-3255e69d8610-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2c50b60e4588ee83d95ba750c21982295348c97e
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/TryCatchLife
23
2
8
2
8
0
0
10
40
25
Commented
Commented
programmingrequests
programmingrequests
navigating success maybe interfaces working cypress great overkill use find
navigating success maybe interfaces working cypress great overkill use find
working,solution maybe,overkill use,cypress solution,use user,interfaces overkill,great navigating,complex interfaces,data find,working complex,user
working,solution maybe,overkill use,cypress solution,use user,interfaces overkill,great navigating,complex interfaces,data find,working complex,user
100
https://styles.redditmedia.com/t5_4ca73s/styles/profileIcon_fcbujbofhe571.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=82c45c2a2c9e6a49ff8a63f69b35f3159112bede
pekinson
pekinson
1
2083.142578125
5627.2861328125
1
1
0
0
0
0.002439
0
0
85
pekinson
5/2/2021 12:53:40 AM
0
3
0
0
False
False
False
False
True
False
t2_b7qu9bs1
False
False
True
https://styles.redditmedia.com/t5_4ca73s/styles/profileIcon_fcbujbofhe571.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=82c45c2a2c9e6a49ff8a63f69b35f3159112bede
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/pekinson
1
4
3.05343511450382
1
0.763358778625954
0
0
71
54.1984732824427
131
Posted
Posted
u_pekinson
u_pekinson
octoparse mode scraping advanced task url articles helpcenter many data
octoparse mode scraping advanced task url articles helpcenter many data
advanced,mode helpcenter,octoparse web,scraping octoparse,hc hc,articles articles,900003158843 webpage,click called,octoparse tool,helps suggest,try
advanced,mode helpcenter,octoparse web,scraping octoparse,hc hc,articles articles,900003158843 webpage,click called,octoparse tool,helps suggest,try
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
nubela
nubela
1
2270.74829101563
7550.25732421875
0
1
0
0.023961
0.153925
0.002133
0
0
86
nubela
2/24/2008 5:01:28 PM
0
9018
8419
0
False
False
False
False
True
False
t2_33fs9
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/nubela
3
0
0
0
0
0
0
1
100
1
Commented
Commented
webscraping
webscraping
spam
spam
100
https://styles.redditmedia.com/t5_5te52/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18zMjc2Mzc2_rare_644454af-6940-45d9-acf2-fedb623f420a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ce4976799e4dcae255e4eeaafd632d74a1200914
haiko_hayn
haiko_hayn
1
9374.9892578125
4210.76123046875
0
1
0
0.002445
0
0.002269
0
0
87
Haiko_Hayn
10/20/2017 6:37:18 AM
0
978
239
0
False
False
False
False
True
False
t2_hzb1ko4
False
False
False
https://styles.redditmedia.com/t5_5te52/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18zMjc2Mzc2_rare_644454af-6940-45d9-acf2-fedb623f420a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ce4976799e4dcae255e4eeaafd632d74a1200914
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Haiko_Hayn
69
2
5.12820512820513
2
5.12820512820513
0
0
16
41.025641025641
39
Commented
Commented
scrapinghub
scrapinghub
problem websites give think info moz online job decently right
problem websites give think info moz online job decently right
problem,solutions using,online think,job right,think quite,popular datahen,give services,websites job,decently info,problem popular,right
problem,solutions using,online think,job right,think quite,popular datahen,give services,websites job,decently info,problem popular,right
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
lucid-dreamx
lucid-dreamx
1
9374.9892578125
3687.30908203125
2
1
0
0.002445
0
0.002609
0
0
88
Lucid-Dreamx
9/6/2015 8:40:17 AM
0
10
1
0
False
False
False
False
True
False
t2_q793g
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Lucid-Dreamx
69
5
2.08333333333333
5
2.08333333333333
0
0
91
37.9166666666667
240
Posted
Posted
scrapinghub
scrapinghub
see arrives box appears content edge data show div figure
see arrives box appears content edge data show div figure
content,grabber doesn,show chrome,edge see,arrives figure,unique never,others fminer,pro box,looks div,figure user,agent
content,grabber doesn,show chrome,edge see,arrives figure,unique never,others fminer,pro box,looks div,figure user,agent
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
justinprather
justinprather
1
9111.896484375
6630.39111328125
0
1
0
0.00326
0
0.002217
0
0
89
justinprather
1/27/2016 10:39:55 PM
0
52
18
0
False
False
False
False
True
False
t2_u67mr
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/justinprather
46
1
1.31578947368421
0
0
0
0
35
46.0526315789474
76
Commented
Commented
webscraping
webscraping
table data need row way looking easier whole scanned kind
table data need row way looking easier whole scanned kind
store,links valid,data row,looking lambda,sqs easier,scan links,data table,store go,back entire,table way,programmatically
store,links valid,data row,looking lambda,sqs easier,scan links,data table,store go,back entire,table way,programmatically
114.285714285714
https://styles.redditmedia.com/t5_eoijb/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNmFjYjhmYjgyODgwZDM5YzJiODQ0NmY4Nzc4YTE0ZDM0ZWU2Y2ZiN18zNDYwMTk_rare_bae9695d-bf02-46a5-a598-318d19992278-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c15f8a50e690f95fb68d8ecf4e181b29f387ac5d
juanreasley
juanreasley
26.1100879133989
9111.896484375
5989.4296875
3
1
2
0.00489
0
0.002882
0
0
90
JuanReasley
1/19/2012 4:10:46 AM
0
6791
4539
0
False
False
False
False
True
False
t2_6p3aj
False
False
False
https://styles.redditmedia.com/t5_eoijb/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNmFjYjhmYjgyODgwZDM5YzJiODQ0NmY4Nzc4YTE0ZDM0ZWU2Y2ZiN18zNDYwMTk_rare_bae9695d-bf02-46a5-a598-318d19992278-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c15f8a50e690f95fb68d8ecf4e181b29f387ac5d
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/JuanReasley
46
2
1.92307692307692
1
0.961538461538462
0
0
41
39.4230769230769
104
Posted
Posted
webscraping
webscraping
data row link entries extract click table need paginated same
data row link entries extract click table need paginated same
extract,data second,bit look,row way,workflow octoparse,look problem,several row,click click,another row,table several,entries
extract,data second,bit look,row way,workflow octoparse,look problem,several row,click click,another row,table several,entries
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
brycedavies
brycedavies
1
9347.5673828125
6630.39111328125
0
1
0
0.00326
0
0.002217
0
0
91
brycedavies
7/10/2020 10:58:17 AM
0
74
31
0
False
False
False
False
True
False
t2_633cvyh3
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/brycedavies
46
5
5.15463917525773
0
0
0
0
43
44.3298969072165
97
Commented
Commented
webscraping
webscraping
scrapediary something think bd apify help ps covering good octoparse
scrapediary something think bd apify help ps covering good octoparse
bd,scrapediary build,something scrapediary,scrapediary point,click write,newsletter scrapediary,mailto interface,think willing,drop stuff,interested mailto,bd
bd,scrapediary build,something scrapediary,scrapediary point,click write,newsletter scrapediary,mailto interface,think willing,drop stuff,interested mailto,bd
100
https://styles.redditmedia.com/t5_2qjyuv/styles/profileIcon_snoo64eaea5b-11cf-4805-8b7b-409b1cb17f54-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=63e75c5f28da03749c8056b7b39b59ecd3227324
anon970529
anon970529
1
2270.74829101563
5003.41015625
0
1
0
0.020481
8E-06
0.002152
0
0
92
anon970529
6/10/2020 7:48:16 AM
0
3
1
0
False
False
False
False
True
False
t2_6q2pya7y
False
False
False
https://styles.redditmedia.com/t5_2qjyuv/styles/profileIcon_snoo64eaea5b-11cf-4805-8b7b-409b1cb17f54-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=63e75c5f28da03749c8056b7b39b59ecd3227324
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/anon970529
5
0
0
0
0
0
0
5
62.5
8
Commented
Commented
webscraping
webscraping
scraping web amazaing tool octoparse
scraping web amazaing tool octoparse
octoparse,amazaing amazaing,tool web,scraping tool,web
octoparse,amazaing amazaing,tool web,scraping tool,web
1000
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
millyfang
millyfang
8287.32901142165
2470.90063476563
5186.59912109375
7
4
660
0.027169
3.3E-05
0.003677
0
0.5
93
Millyfang
5/26/2020 1:40:54 AM
0
8
2
0
False
False
False
False
True
False
t2_6kuzpdw3
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Millyfang
5
10
1.46412884333821
3
0.439238653001464
0
0
338
49.4875549048316
683
Posted RepliedTo Commented
RepliedTo Posted Commented
webscraping scrapinghub pythontips
scrapinghub webscraping pythontips
data octoparse crawler step build yelp re extract website twitter
yelp step data twitter workflow one ## news preview pagination
octoparse,blog step,input create,workflow extract,data ##,step data,preview url,build data,excel pagination,setting yelp,crawler
##,step data,fields blog,scrape youtu,yu8vufimyze yelp,data scrape,yelp create,workflow data,preview data,excel octoparse,blog
1000
https://styles.redditmedia.com/t5_b05de/styles/profileIcon_snoo0852fdcb-ce90-45b8-8abb-1ebd1a56e05c-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5e4f3f2a112ff8b84d92a658bbbd14daf57ae57c
matty_fu
matty_fu
8262.21892350825
2896.58618164063
4989.3134765625
1
4
658
0.025602
3E-06
0.002813
0
0
94
matty_fu
2/22/2016 11:53:37 PM
0
533
6349
0
False
False
False
False
True
False
t2_vep5j
False
False
True
https://styles.redditmedia.com/t5_b05de/styles/profileIcon_snoo0852fdcb-ce90-45b8-8abb-1ebd1a56e05c-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5e4f3f2a112ff8b84d92a658bbbd14daf57ae57c
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/matty_fu
5
3
3.40909090909091
2
2.27272727272727
0
0
38
43.1818181818182
88
Commented RepliedTo
RepliedTo Commented
webscraping
webscraping
scrape easy see banned twitter help yelp send trivial octoparse
scrape easy see banned twitter help yelp send trivial octoparse
amounts,data trivial,amounts really,see video,octoparse ip,banned scraping,non around,anti without,using trying,scrape using,official
amounts,data trivial,amounts really,see video,octoparse ip,banned scraping,non around,anti without,using trying,scrape using,official
1000
https://styles.redditmedia.com/t5_b3c6z/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNmFjYjhmYjgyODgwZDM5YzJiODQ0NmY4Nzc4YTE0ZDM0ZWU2Y2ZiN180Nzk3OTY_rare_9d9ba1a1-eff2-4abf-99ec-d4295f369775-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=1109c15414a72cb5b3f892f3abbb197b59e54c67
b33rnuts
b33rnuts
6780.72373661772
2679.51538085938
5067.78515625
2
1
540
0.026626
9E-06
0.002236
0
0.5
95
B33rNuts
1/29/2011 11:08:31 PM
0
8750
11433
0
False
False
False
False
True
False
t2_4s4vb
False
False
False
https://styles.redditmedia.com/t5_b3c6z/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNmFjYjhmYjgyODgwZDM5YzJiODQ0NmY4Nzc4YTE0ZDM0ZWU2Y2ZiN180Nzk3OTY_rare_9d9ba1a1-eff2-4abf-99ec-d4295f369775-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=1109c15414a72cb5b3f892f3abbb197b59e54c67
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/B33rNuts
5
4
3.6697247706422
1
0.917431192660551
0
0
44
40.3669724770642
109
Commented RepliedTo
Commented RepliedTo
webscraping
webscraping
yelp simple handy scraping banned people octoparse way being find
handy scraping banned people octoparse way being find once website
results,name security,simple simple,octoparse website,business phone,number click,each information,gather being,banned such,simple yelp,much
results,name security,simple simple,octoparse website,business phone,number click,each information,gather being,banned such,simple yelp,much
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
jafirgull
jafirgull
1
1135.37414550781
5627.2861328125
1
1
0
0
0
0.002439
0
0
96
JafirGull
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/JafirGull
1
Posted
Posted
software
software
100
https://styles.redditmedia.com/t5_rs4s7/styles/profileIcon_snooc661f4a9-08c0-4e0a-b33f-8caae3fea1c2-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0b05ac565eba99abdc524db878cf603e434081a0
universecoder
universecoder
1
4013.08447265625
4484.8603515625
0
1
0
0.008431
0
0.002178
0
0
97
universecoder
11/24/2018 8:34:56 AM
0
369
1292
0
False
False
False
False
True
False
t2_1ya37xj7
False
False
True
https://styles.redditmedia.com/t5_rs4s7/styles/profileIcon_snooc661f4a9-08c0-4e0a-b33f-8caae3fea1c2-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0b05ac565eba99abdc524db878cf603e434081a0
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/universecoder
10
0
0
1
12.5
0
0
4
50
8
RepliedTo
RepliedTo
webscraping
webscraping
hit very rate limits quickly
hit very rate limits quickly
rate,limits hit,rate quickly,hit very,quickly
rate,limits hit,rate quickly,hit very,quickly
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
cl1ffhanger
cl1ffhanger
1
4126.65087890625
2392.92309570313
2
1
0
0.008431
0
0.002355
0
0
98
cl1ffhanger
12/12/2016 10:35:55 AM
0
881
1456
0
False
False
False
False
True
False
t2_13ibws
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/cl1ffhanger
10
5
5.15463917525773
0
0
0
0
43
44.3298969072165
97
Posted
Posted
webscraping
webscraping
tweet number trying replies tweets twitter appreciated responses octoparse way
tweet number trying replies tweets twitter appreciated responses octoparse way
90,minutes body,text greatly,appreciated number,replies text,tweet tweets,responses 40,tweets octoparse,90 retweets,number looking,extract
90,minutes body,text greatly,appreciated number,replies text,tweet tweets,responses 40,tweets octoparse,90 retweets,number looking,extract
100
https://styles.redditmedia.com/t5_7zxijg/styles/profileIcon_snoo33ce69d6-5b3d-40b9-bd7f-80978e65063d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=4cd91862b0d4e6c5837a51b41a0a327d5379c7e4
gillesquenot
gillesquenot
1
6204.90625
9822.0849609375
1
1
0
0.006112
0
0.002173
0
1
99
GillesQuenot
3/2/2023 5:02:06 PM
0
43
123
0
False
False
False
False
True
False
t2_w29e76gz
False
False
False
https://styles.redditmedia.com/t5_7zxijg/styles/profileIcon_snoo33ce69d6-5b3d-40b9-bd7f-80978e65063d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=4cd91862b0d4e6c5837a51b41a0a327d5379c7e4
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/GillesQuenot
16
2
7.14285714285714
1
3.57142857142857
0
0
9
32.1428571428571
28
Commented RepliedTo
Commented RepliedTo
webscraping
webscraping
free code feel pm help without experienced developer experiences difficult
free feel pm help without experienced developer experiences difficult code
experienced,developer pm,free experiences,code developer,feel code,difficult code,experienced feel,free difficult,help free,pm help,without
experienced,developer pm,free experiences,code developer,feel code,difficult code,experienced feel,free difficult,help free,pm help,without
228.571428571429
https://styles.redditmedia.com/t5_4zi0pl/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYjljMDQyYzMyNzViYzQ5Nzk5Njg4ZWVhMWEyOWIxNDA1ZDAyOTQ2Yl80MTg2MTM_rare_2fecc6c7-3ce0-46a9-9f89-38ac23720fff-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b255284fe386ad912b1be19abcde06652f6ad15b
majestic-dust4427
majestic-dust4427
226.990791220591
6016.26806640625
9154.55078125
5
3
18
0.010187
0
0.003328
0
0.5
100
Majestic-Dust4427
9/2/2021 12:56:03 PM
0
1051
12054
0
False
False
False
False
True
False
t2_eaj7fk8m
False
False
True
https://styles.redditmedia.com/t5_4zi0pl/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYjljMDQyYzMyNzViYzQ5Nzk5Njg4ZWVhMWEyOWIxNDA1ZDAyOTQ2Yl80MTg2MTM_rare_2fecc6c7-3ce0-46a9-9f89-38ac23720fff-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b255284fe386ad912b1be19abcde06652f6ad15b
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Majestic-Dust4427
16
0
0
3
2.05479452054795
0
0
67
45.8904109589041
146
RepliedTo Posted Commented
RepliedTo Commented Posted
webscraping
webscraping
error octoparse urls sites infinite automate power used scroll getting
error urls getting wordpress scraping web octoparse sites infinite automate
power,automate infinite,scroll same,error check,selected everything,scrolled selecting,everything wordpress,read register,log urls,wordpress know,anything
power,automate infinite,scroll same,error check,selected everything,scrolled selecting,everything wordpress,read register,log urls,wordpress know,anything
100
https://styles.redditmedia.com/t5_20oexl/styles/profileIcon_snoo5d3659cb-51d3-4d89-9908-f59a77631f1f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d7887a507da287a6092f455f54faf4ad039a9635
ivanoski-007
ivanoski-007
1
5842.3232421875
9927.7822265625
1
1
0
0.006112
0
0.002173
0
1
101
ivanoski-007
11/15/2010 11:48:36 PM
0
158302
176858
0
False
False
False
False
True
False
t2_4iy5o
False
False
True
https://styles.redditmedia.com/t5_20oexl/styles/profileIcon_snoo5d3659cb-51d3-4d89-9908-f59a77631f1f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d7887a507da287a6092f455f54faf4ad039a9635
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ivanoski-007
16
0
0
0
0
0
0
3
75
4
Commented
Commented
webscraping
webscraping
python scraping try
python scraping try
try,scraping scraping,python
try,scraping scraping,python
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
ok-computer9983
ok-computer9983
1
6263.82666015625
8641.8193359375
0
1
0
0.006112
0
0.002173
0
0
102
Ok-Computer9983
4/29/2021 5:24:18 AM
0
9
-1
0
False
False
False
False
True
False
t2_bttzn9is
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Ok-Computer9983
16
1
7.69230769230769
1
7.69230769230769
0
0
4
30.7692307692308
13
Commented
Commented
webscraping
webscraping
useful discord found questions server resource
useful discord found questions server resource
useful,discord discord,server found,resource server,questions resource,useful
useful,discord discord,server found,resource server,questions resource,useful
157.142857142857
https://styles.redditmedia.com/t5_2m7vpb/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8xNDQwOTI2_rare_fa740c87-90ee-4e12-a704-fa66c44e8d36-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=61aee5f16352d6b87d9a364942bef3f5e56d69d1
elitedoorhugger
elitedoorhugger
101.440351653596
5829.35791015625
8493.1181640625
1
2
8
0.007641
0
0.002355
0
0.5
103
Elitedoorhugger
4/28/2020 12:16:47 PM
0
2
92
0
False
False
False
False
True
False
t2_6bsr06v9
False
False
False
https://styles.redditmedia.com/t5_2m7vpb/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8xNDQwOTI2_rare_fa740c87-90ee-4e12-a704-fa66c44e8d36-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=61aee5f16352d6b87d9a364942bef3f5e56d69d1
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Elitedoorhugger
16
0
0
0
0
0
0
42
37.8378378378378
111
Commented
Commented
webscraping
webscraping
data making paremeters using time try check scroll each loads
apply text considering sku html whole need change current req
each,time using,adding data,making time,loads loads,data try,check scroll,try infinite,scroll loading,using request,loading
code,sku data,octaparse whole,html html,data considering,whole need,change current,code req,text apply,current hi,considering
100
https://styles.redditmedia.com/t5_juxqh/styles/profileIcon_snoo9ed82801-a4f9-4bb9-843e-1c4de650396d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=79d2d52a070d08d0da19ebe55301432391301b0a
screamingismyair
screamingismyair
1
4957.1396484375
4987.3017578125
1
1
0
0.018195
0
0.002175
0
1
104
ScreamingIsMyAir
10/30/2012 8:50:32 PM
0
436
3421
0
False
False
False
False
True
False
t2_9gh71
False
False
True
https://styles.redditmedia.com/t5_juxqh/styles/profileIcon_snoo9ed82801-a4f9-4bb9-843e-1c4de650396d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=79d2d52a070d08d0da19ebe55301432391301b0a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ScreamingIsMyAir
12
7
2.04081632653061
4
1.16618075801749
0
0
155
45.1895043731778
343
RepliedTo Commented
Commented RepliedTo
learnpython
learnpython
google few json file lot learn multiple store soup mentioned
google json file multiple store soup mentioned months information beautiful
mentioned,lot json,file beautiful,soup basic,information google,store google,'how gather,know days,week lost,first boring,stuff
mentioned,lot json,file beautiful,soup basic,information google,store google,'how gather,know days,week lost,first boring,stuff
1000
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
scrapeloser
scrapeloser
4495.70573649841
5186.0888671875
5495.2705078125
4
2
358
0.022227
0
0.002715
0
0.333333333333333
105
ScrapeLoser
5/30/2019 1:42:50 PM
0
1
0
0
False
False
False
False
True
False
t2_3uz7s19c
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ScrapeLoser
12
8
2.13903743315508
7
1.8716577540107
0
0
150
40.1069518716578
374
RepliedTo Posted
Posted RepliedTo
learnpython
learnpython
life time something know trying few #x200b create people import
trying few #x200b create import octoparse follow python situation teach
apply,life import,io appreciative,time automate,boring create,python frustrated,serious please,assume requires,scrape python,paid scrape,individual
apply,life import,io appreciative,time automate,boring create,python frustrated,serious please,assume requires,scrape python,paid scrape,individual
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
yassiiir
yassiiir
1
4808.294921875
4629.16650390625
0
1
0
0.021032
0
0.002182
0
0
106
Yassiiir
7/19/2019 7:01:41 AM
0
1
0
0
False
False
False
False
True
False
t2_46ri0rh4
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/Yassiiir
12
1
4
1
4
0
0
8
32
25
RepliedTo
RepliedTo
learnpython
learnpython
product stock help know love 999 ammount tuto trick card
product stock help know love 999 ammount tuto trick card
999,card stock,product know,ammount card,trick love,help help,tuto ammount,stock trick,know tuto,999
999,card stock,product know,ammount card,trick love,help help,tuto ammount,stock trick,know tuto,999
1000
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
commandlineluser
commandlineluser
6755.61364870432
5056.72509765625
5221.52294921875
3
2
538
0.026612
1E-06
0.002901
0
0.25
107
commandlineluser
11/1/2013 11:06:32 PM
0
1338
5251
0
False
False
False
False
True
False
t2_dqm6u
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/commandlineluser
12
4
3.27868852459016
0
0
0
0
64
52.4590163934426
122
Commented
Commented
learnprogramming learnpython
learnpython learnprogramming
td text revenue strong following ' rr tag icon match
td text revenue strong following ' rr tag icon match
following,td revenue,following rr,icon text,revenue revenue,strong strong,text td,text write,tutorial million,td strong,td
following,td revenue,following rr,icon text,revenue revenue,strong strong,text td,text write,tutorial million,td strong,td
100
https://styles.redditmedia.com/t5_49mgb6/styles/profileIcon_snoof7b96725-a225-4d5b-a516-321c08ec1251-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2e156239a3fa8a6e554f203963e0bfde2f22b630
global_divide2795
global_divide2795
1
7986.15673828125
6950.87158203125
0
1
0
0.00326
0
0.002217
0
0
108
Global_Divide2795
4/16/2021 7:20:32 PM
0
172
68
0
False
False
False
False
True
False
t2_bkx2384y
False
False
False
https://styles.redditmedia.com/t5_49mgb6/styles/profileIcon_snoof7b96725-a225-4d5b-a516-321c08ec1251-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2e156239a3fa8a6e554f203963e0bfde2f22b630
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Global_Divide2795
45
0
0
0
0
0
0
7
36.8421052631579
19
Commented
Commented
sales
sales
need see linkedin tool extract paid access
need see linkedin tool extract paid access
paid,access need,paid linkedin,tool access,linkedin tool,extract extract,see
paid,access need,paid linkedin,tool access,linkedin tool,extract extract,see
114.285714285714
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
hellletloose94
hellletloose94
26.1100879133989
8223.158203125
6309.482421875
3
1
2
0.00489
0
0.002882
0
0
109
hellletloose94
11/20/2020 11:36:34 PM
0
1501
3114
0
False
False
False
False
True
False
t2_8z2dzbgo
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/hellletloose94
45
6
3.19148936170213
1
0.531914893617021
0
0
84
44.6808510638298
188
Posted
Posted
sales
sales
sales scraper linkedin nav necessary dozens contact buster phantom create
sales scraper linkedin nav necessary dozens contact buster phantom create
sales,nav phantom,buster access,sales necessary,linkedin nav,scraping automate,workflow navigator,necessary mention,sales turn,bring scraper,platform
sales,nav phantom,buster access,sales necessary,linkedin nav,scraping automate,workflow navigator,necessary mention,sales turn,bring scraper,platform
100
https://styles.redditmedia.com/t5_3z4zdr/styles/profileIcon_snooffe64649-d0a9-4743-b9e1-e861c03f0142-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c06edc2bb4e821742d51f58fb93a084a7d03a8c9
sensitive_purchase71
sensitive_purchase71
1
8460.4638671875
5668.94873046875
0
1
0
0.00326
0
0.002217
0
0
110
Sensitive_Purchase71
2/17/2021 9:27:53 AM
0
1
5
0
False
False
False
False
True
False
t2_8wve4nbx
False
False
False
https://styles.redditmedia.com/t5_3z4zdr/styles/profileIcon_snooffe64649-d0a9-4743-b9e1-e861c03f0142-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c06edc2bb4e821742d51f58fb93a084a7d03a8c9
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Sensitive_Purchase71
45
0
0
0
0
0
0
24
52.1739130434783
46
Commented
Commented
sales
sales
tools people looking running hacking desktop two gosh growth forward
tools people looking running hacking desktop two gosh growth forward
running,desktop use,phantombuster hacking,communities phantombuster,check communities,gosh really,forward growth,hacking people,really beats,classics those,two
running,desktop use,phantombuster hacking,communities phantombuster,check communities,gosh really,forward growth,hacking people,really beats,classics those,two
142.857142857143
https://styles.redditmedia.com/t5_6dvriv/styles/profileIcon_snoo89e54848-74f0-420d-a8d6-352d1732d675-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=98aff3f966c00cf7b367346ad53569075d364f9b
lazyleadz
lazyleadz
76.3302637401968
5997.6572265625
2809.43676757813
2
2
6
0.00652
0
0.002526
0
1
111
LazyLeadz
5/17/2022 7:25:23 PM
0
19
1771
0
False
False
False
False
True
False
t2_mwh3d4cf
False
False
False
https://styles.redditmedia.com/t5_6dvriv/styles/profileIcon_snoo89e54848-74f0-420d-a8d6-352d1732d675-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=98aff3f966c00cf7b367346ad53569075d364f9b
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/LazyLeadz
22
2
5
0
0
0
0
16
40
40
RepliedTo Commented
Commented RepliedTo
sales
sales
linkedin month cool thanks sharing clients outreach qualified meeting average
cool thanks sharing clients outreach qualified meeting average many automate
automate,linkedin many,meeting generating,linkedin sharing,wanna month,qualified qualified,prospects wanna,automate thanks,sharing average,month linkedin,outreach
automate,linkedin many,meeting generating,linkedin sharing,wanna month,qualified qualified,prospects wanna,automate thanks,sharing average,month linkedin,outreach
100
https://styles.redditmedia.com/t5_723dok/styles/profileIcon_snoo1ac177ee-b553-49f0-80b6-233704d48981-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=841927ed5d1b3c6fdc9202c043520851d7b94982
mrbadapple2022
mrbadapple2022
1
5653.1552734375
1844.544921875
1
1
0
0.004347
0
0.002263
0
1
112
MrBadApple2022
9/18/2022 10:37:46 AM
0
1
242
0
False
False
False
False
True
False
t2_sli98naz
False
False
False
https://styles.redditmedia.com/t5_723dok/styles/profileIcon_snoo1ac177ee-b553-49f0-80b6-233704d48981-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=841927ed5d1b3c6fdc9202c043520851d7b94982
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/MrBadApple2022
22
0
0
0
0
0
0
4
36.3636363636364
11
RepliedTo
RepliedTo
sales
sales
id interested know mind
id interested know mind
know,mind interested,know id,interested
know,mind interested,know id,interested
171.428571428571
https://styles.redditmedia.com/t5_3woctt/styles/profileIcon_snoo973c5329-8d53-45c3-a867-bb15c8b8a52d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ae68ded3cc00d389af28d337d48685c517781fff
same_paint6431
same_paint6431
126.550439566995
6263.82666015625
3546.65380859375
4
3
10
0.007824
0
0.003032
0
0.666666666666667
113
Same_Paint6431
2/5/2021 2:07:22 PM
0
951
480
0
False
False
False
False
True
False
t2_a6ei99mb
False
False
False
https://styles.redditmedia.com/t5_3woctt/styles/profileIcon_snoo973c5329-8d53-45c3-a867-bb15c8b8a52d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ae68ded3cc00d389af28d337d48685c517781fff
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Same_Paint6431
22
57
6.64335664335664
13
1.51515151515152
0
0
363
42.3076923076923
858
RepliedTo Posted
Posted RepliedTo
sales
sales
leads lex lead used call pretty selling outscraper linkedin example
lead free paid used call pretty capture google scenario selling
lex,leads pretty,much lead,generation cold,call outscraper,outscraper lead,capture unlimited,leads b2b,lead example,let's google,map
pretty,much outscraper,outscraper lead,capture b2b,lead google,map let's,selling leads,pretty phone,contact paid,version linkedin,prospecting
100
https://styles.redditmedia.com/t5_2j7659/styles/profileIcon_snood77f13e8-6eef-4e4f-952a-eca460a00a57-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=02fede411b48288bc15e03295341a8ec9c041100
bfh956
bfh956
1
6245.76708984375
3496.49755859375
0
1
0
0.00489
0
0.002187
0
0
114
bfh956
3/30/2020 8:03:15 PM
0
11
11
0
False
False
False
False
True
False
t2_62su4bt8
False
False
False
https://styles.redditmedia.com/t5_2j7659/styles/profileIcon_snood77f13e8-6eef-4e4f-952a-eca460a00a57-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=02fede411b48288bc15e03295341a8ec9c041100
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/bfh956
22
0
0
0
0
0
0
1
20
5
Commented
Commented
sales
sales
company
company
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
outlandishnessok153
outlandishnessok153
1
5913.82666015625
2575.76342773438
1
1
0
0.00489
0
0.002187
0
1
115
OutlandishnessOk153
12/21/2020 12:24:30 AM
0
128
1142
0
False
False
False
False
True
False
t2_7yck4rzh
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/OutlandishnessOk153
22
1
1.51515151515152
0
0
0
0
32
48.4848484848485
66
RepliedTo Commented
Commented RepliedTo
sales
sales
outreach b2b linkedin need email help sales meeting effective game
linkedin need email help sales meeting effective game arrange asking
need,automate meeting,please outreach,linkedin linkedin,currently currently,sales please,know email,phone effective,b2b yes,dm'd use,help
need,automate meeting,please outreach,linkedin linkedin,currently currently,sales please,know email,phone effective,b2b yes,dm'd use,help
100
https://styles.redditmedia.com/t5_en4ig/styles/profileIcon_snood7f2408c-4e1c-432b-81c7-472cd0b324a7-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=8dea1bab1bed3627a47b822bf4cbfc2d28fa87d0
kramrm
kramrm
1
5791.978515625
71.2179489135742
0
1
0
0.00489
0
0.002212
0
0
116
kramrm
1/6/2011 9:42:47 PM
0
8988
5293
0
False
False
False
False
True
False
t2_4p2pv
False
False
True
https://styles.redditmedia.com/t5_en4ig/styles/profileIcon_snood7f2408c-4e1c-432b-81c7-472cd0b324a7-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=8dea1bab1bed3627a47b822bf4cbfc2d28fa87d0
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/kramrm
21
0
0
0
0
0
0
4
40
10
RepliedTo
RepliedTo
learnpython
learnpython
much api depends use
much api depends use
much,use api,much depends,api
much,use api,much depends,api
171.428571428571
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
eccepiscinam
eccepiscinam
126.550439566995
5941.6484375
664.620300292969
2
2
10
0.007824
0
0.002776
0
0.333333333333333
117
eccepiscinam
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/eccepiscinam
21
1
5.55555555555556
0
0
0
0
9
50
18
Commented RepliedTo
Commented RepliedTo
learnpython
learnpython
better google probably realize api apis charges tool selenium looked
better google probably realize api apis charges tool selenium looked
probably,better google,charges realize,google charges,apis better,tool api,probably looked,api tool,selenium
probably,better google,charges realize,google charges,apis better,tool api,probably looked,api tool,selenium
100
https://styles.redditmedia.com/t5_1uf1kv/styles/profileIcon_snooc6f126bd-1a09-46a4-aa71-70d0cfeb38e2-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=37d0a028ed80f56a0387318e4953a6c70c32a5a6
lolslim
lolslim
1
5653.1552734375
1773.32690429688
0
1
0
0.004347
0
0.002265
0
0
118
lolslim
6/28/2013 3:10:48 AM
0
3835
21871
0
False
False
False
False
True
False
t2_c6vyp
False
False
False
https://styles.redditmedia.com/t5_1uf1kv/styles/profileIcon_snooc6f126bd-1a09-46a4-aa71-70d0cfeb38e2-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=37d0a028ed80f56a0387318e4953a6c70c32a5a6
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/lolslim
21
0
0
1
2.08333333333333
0
0
27
56.25
48
RepliedTo
RepliedTo
learnpython
learnpython
scrape dont go working area community osm grabbing credit download
scrape dont go working area community osm grabbing credit download
working,grabbing unless,go provide,dont metro,area go,over download,osm area,pokemon local,community file,country bot,local
working,grabbing unless,go provide,dont metro,area go,over download,osm area,pokemon local,community file,country bot,local
142.857142857143
https://styles.redditmedia.com/t5_7tgr9/styles/profileIcon_snoo68c9221a-8bf2-4a8b-9f11-4cc3bfe031ca-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=7d2e70ab5ca98cfc30593f285e7eaf8a476e6313
omar_88
omar_88
76.3302637401968
5790.38623046875
1229.80236816406
2
1
6
0.00652
0
0.002552
0
0.5
119
Omar_88
7/18/2017 5:01:14 PM
0
1510
3493
0
False
False
False
False
True
False
t2_7iuhjtv
False
False
False
https://styles.redditmedia.com/t5_7tgr9/styles/profileIcon_snoo68c9221a-8bf2-4a8b-9f11-4cc3bfe031ca-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=7d2e70ab5ca98cfc30593f285e7eaf8a476e6313
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Omar_88
21
2
20
1
10
0
0
4
40
10
RepliedTo
RepliedTo
learnpython
learnpython
pricey web worth yes very well apps
pricey web worth yes very well apps
worth,web well,worth very,pricey yes,very web,apps pricey,well
worth,web well,worth very,pricey yes,very web,apps pricey,well
100
https://styles.redditmedia.com/t5_21tfrn/styles/profileIcon_snoo0fe70006-ad9e-4826-bc7e-2db6ee41968f-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2313931e2ec8ddb4b1189436a0b4f0b53f32d9a7
rahul_desai1999
rahul_desai1999
1
6263.82666015625
608.969970703125
2
1
0
0.00489
0
0.002391
0
0
120
Rahul_Desai1999
7/23/2019 5:57:15 AM
0
2068
262
0
False
False
False
False
True
False
t2_3rx7qpqf
False
False
False
https://styles.redditmedia.com/t5_21tfrn/styles/profileIcon_snoo0fe70006-ad9e-4826-bc7e-2db6ee41968f-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2313931e2ec8ddb4b1189436a0b4f0b53f32d9a7
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Rahul_Desai1999
21
1
1.23456790123457
2
2.46913580246914
0
0
29
35.8024691358025
81
Posted
Posted
learnpython
learnpython
selenium same using tried hard email octoparse etc continue forever
selenium same using tried hard email octoparse etc continue forever
using,selenium tried,using same,tried forever,ashamed attribute,value selenium,first class,name tool,octoparse turns,pay name,infact
using,selenium tried,using same,tried forever,ashamed attribute,value selenium,first class,name tool,octoparse turns,pay name,infact
185.714285714286
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
rawrtherapy
rawrtherapy
151.660527480394
9613.625
9197.7978515625
5
4
12
0.00978
0
0.003484
0
0.75
121
rawrtherapy
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/rawrtherapy
20
4
2
0
0
0
0
79
39.5
200
RepliedTo Posted
Posted RepliedTo
FulfillmentByAmazon
FulfillmentByAmazon
data octoparse software day mine sign month yeah compare week
software day sign month compare week mine data octoparse yeah
data,mine dates,reputable left,day guys,know month,month yeah,actually anyone,insight day,day mean,public done,day
data,mine dates,reputable left,day guys,know month,month yeah,actually anyone,insight day,day mean,public done,day
100
https://styles.redditmedia.com/t5_em3q3/styles/profileIcon_snoo1a4d441b-1cab-4372-b057-4d9d27b0ae32-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2866db25d87e3dcd8e00fdabb90f4c1bce40a814
vrjain
vrjain
1
9777.14453125
8467.814453125
1
1
0
0.005589
0
0.002178
0
1
122
vrjain
8/6/2014 10:37:59 AM
0
381
113
0
False
False
False
False
True
False
t2_hqzoe
False
False
False
https://styles.redditmedia.com/t5_em3q3/styles/profileIcon_snoo1a4d441b-1cab-4372-b057-4d9d27b0ae32-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2866db25d87e3dcd8e00fdabb90f4c1bce40a814
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/vrjain
20
0
0
0
0
0
0
12
48
25
Commented
Commented
FulfillmentByAmazon
FulfillmentByAmazon
setup pushed something built scraper pull tool getting database interested
setup pushed something built scraper pull tool getting database interested
setup,something setup,scraper data,pushed pull,data tool,built scraper,pull interested,getting database,setup getting,tool pushed,database
setup,something setup,scraper data,pushed pull,data tool,built scraper,pull interested,getting database,setup getting,tool pushed,database
100
https://styles.redditmedia.com/t5_cjva1/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV81MDIwMjY_rare_c0942272-712d-447b-b572-cb5739e73158-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=608401aa8247055a0cd880ebca968dc0ba080332
resoluter08
resoluter08
1
9450.1044921875
9927.7822265625
1
1
0
0.005589
0
0.002178
0
1
123
resoluter08
10/19/2014 2:27:59 PM
0
56
1291
0
False
False
False
False
True
False
t2_ixt3i
False
False
False
https://styles.redditmedia.com/t5_cjva1/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV81MDIwMjY_rare_c0942272-712d-447b-b572-cb5739e73158-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=608401aa8247055a0cd880ebca968dc0ba080332
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/resoluter08
20
0
0
0
0
0
0
8
44.4444444444444
18
Commented
Commented
FulfillmentByAmazon
FulfillmentByAmazon
reason exist tos tool getting within way data
reason exist tos tool getting within way data
within,tos way,getting getting,data tool,exist reason,tool data,within exist,way
within,tos way,getting getting,data tool,exist reason,tool data,within exist,way
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
jordanwilson23
jordanwilson23
1
9969.35546875
9533.3515625
0
1
0
0.005589
0
0.002178
0
0
124
jordanwilson23
11/23/2013 3:01:38 AM
0
3154
15210
0
False
False
False
False
True
False
t2_e1al8
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/jordanwilson23
20
0
0
0
0
0
0
14
58.3333333333333
24
Commented
Commented
FulfillmentByAmazon
FulfillmentByAmazon
reviews record time customer performance period always daily dashboard choose
reviews record time customer performance period always daily dashboard choose
time,period record,daily daily,always dashboard,customer customer,reviews choose,time always,reviews period,record brand,dashboard previous,day
time,period record,daily daily,always dashboard,customer customer,reviews choose,time always,reviews period,record brand,dashboard previous,day
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
oldschoolvalue
oldschoolvalue
1
9257.8935546875
8862.244140625
1
1
0
0.005589
0
0.002178
0
1
125
oldschoolvalue
9/30/2012 4:44:39 PM
0
802
580
0
False
False
False
False
True
False
t2_963uz
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/oldschoolvalue
20
1
2.94117647058824
0
0
0
0
15
44.1176470588235
34
Commented
Commented
FulfillmentByAmazon
FulfillmentByAmazon
dates exact api whether see aggregating week except scrape everything
dates exact api whether see aggregating week except scrape everything
everything,except exact,last worth,implementing aggregating,dates see,whether ask,exact scrape,offered someone,ask except,aggregating dates,someone
everything,except exact,last worth,implementing aggregating,dates see,whether ask,exact scrape,offered someone,ask except,aggregating dates,someone
100
https://styles.redditmedia.com/t5_7o7ds7/styles/profileIcon_snoo78b209fe-03c9-4ed9-bf2c-af3f11a1b8e9-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e05efabab2ff8037b9dd06494aeb92c6cb996ea8
snellfarfar
snellfarfar
1
5653.1552734375
7869.58349609375
2
2
0
0.005094
0
0.002432
0
1
126
Snellfarfar
12/28/2022 1:19:17 PM
0
2
0
0
False
False
False
False
True
False
t2_m2fzb61x
False
False
False
https://styles.redditmedia.com/t5_7o7ds7/styles/profileIcon_snoo78b209fe-03c9-4ed9-bf2c-af3f11a1b8e9-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e05efabab2ff8037b9dd06494aeb92c6cb996ea8
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Snellfarfar
16
5
2.92397660818713
0
0
0
0
68
39.766081871345
171
Posted RepliedTo
RepliedTo Posted
webscraping
webscraping
sku number id help data extract capsules scripting know using
sku number id help data extract capsules scripting know nespresso
sku,number nespresso,order id,product capsules,vertuo order,capsules extract,data extract,sku site,nespresso help,right visible,put
sku,number nespresso,order id,product capsules,vertuo order,capsules extract,data extract,sku site,nespresso help,right visible,put
142.857142857143
https://styles.redditmedia.com/t5_egvw4/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8yMzIxOTM_rare_e0b69898-c244-497e-ad2d-3be105a192cc-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=9465c1186595ba829ec494bb7c9c0c9ae7ff83ff
lordcrumpets
lordcrumpets
76.3302637401968
6525.33056640625
5504.50048828125
4
3
6
0.007335
0
0.003179
0
0.666666666666667
127
LordCrumpets
6/20/2016 9:12:57 PM
0
23773
24170
0
False
False
False
False
True
False
t2_yw8di
False
False
False
https://styles.redditmedia.com/t5_egvw4/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8yMzIxOTM_rare_e0b69898-c244-497e-ad2d-3be105a192cc-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=9465c1186595ba829ec494bb7c9c0c9ae7ff83ff
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/LordCrumpets
34
8
6.45161290322581
0
0
0
0
45
36.2903225806452
124
RepliedTo Posted
Posted RepliedTo
Automate
Automate
program script go something best sale need ideas extensions octoparse
program go something best sale need ideas extensions octoparse way
bikes,scrape octoparse,nothing route,imagine something,sale scrape,available best,way need,something maybe,once exactly,code two,grand
bikes,scrape octoparse,nothing route,imagine something,sale scrape,available best,way need,something maybe,once exactly,code two,grand
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
geminii27
geminii27
1
6449.52490234375
6395.37158203125
1
1
0
0.004401
0
0.002192
0
1
128
Geminii27
3/2/2011 6:07:19 AM
0
2829
1283293
0
False
False
False
False
True
False
t2_4wrg1
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Geminii27
34
1
25
0
0
0
0
2
50
4
RepliedTo
RepliedTo
Automate
Automate
two sure grand
two sure grand
two,grand sure,two
two,grand sure,two
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
stormkrieg
stormkrieg
1
6832.99609375
5246.8212890625
1
1
0
0.004401
0
0.002192
0
1
129
Stormkrieg
2/15/2015 6:43:19 PM
0
226
8031
0
False
False
False
False
True
False
t2_ldh6b
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Stormkrieg
34
3
1.2
2
0.8
0
0
102
40.8
250
Commented RepliedTo
Commented RepliedTo
Automate
Automate
re use solution code automation one automate minimum data low
use code automate minimum data low tool tools see api
low,code curl,otherwise tool,job solution,expect request,server start,automate projects,custom call,mac first,analyze specific,requirements
low,code curl,otherwise tool,job solution,expect request,server start,automate projects,custom call,mac first,analyze specific,requirements
100
https://styles.redditmedia.com/t5_6revjf/styles/profileIcon_snoocbcae6cd-0660-4353-9403-019cbf190198-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=67a2ccf7b519611bf56fb594ff6ddfc49d551d71
teleworksolutions
teleworksolutions
1
6293.470703125
4871.3076171875
0
1
0
0.004401
0
0.002192
0
0
130
TeleworkSolutions
7/25/2022 1:50:10 AM
0
6
37
0
False
False
False
False
True
False
t2_pyyiammr
False
False
False
https://styles.redditmedia.com/t5_6revjf/styles/profileIcon_snoocbcae6cd-0660-4353-9403-019cbf190198-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=67a2ccf7b519611bf56fb594ff6ddfc49d551d71
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/TeleworkSolutions
34
0
0
0
0
0
0
2
25
8
Commented
Commented
Automate
Automate
pm 000
pm 000
000,pm
000,pm
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
sure-series8740
sure-series8740
1
1451.29711914063
5627.2861328125
1
1
0
0
0
0.002439
0
0
131
Sure-Series8740
4/24/2023 6:16:16 AM
0
1
0
0
False
False
False
False
True
False
t2_9vsn47yve
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Sure-Series8740
1
55
5.32945736434108
5
0.484496124031008
0
0
575
55.7170542635659
1032
Posted
Posted
u_Sure-Series8740
u_Sure-Series8740
data scraping web amazon tool technology information help scraper marketing
data scraping web amazon tool technology information help scraper marketing
web,scraping web,data amazon,scraping amazon,data data,extraction scraping,tool auto,detection informed,decisions marketing,automation scraping,help
web,scraping web,data amazon,scraping amazon,data data,extraction scraping,tool auto,detection informed,decisions marketing,automation scraping,help
100
https://styles.redditmedia.com/t5_32vr22/styles/profileIcon_snoo695b55e3-9741-464e-b189-6cfa6c570eb1-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=54b79227a77d1caa7ff1010a1ae7a713956d7778
sexiestboomer
sexiestboomer
1
4778.650390625
4650.5322265625
1
1
0
0.006112
0
0.002206
0
1
132
SexiestBoomer
9/5/2020 1:01:17 PM
0
4005
3587
0
False
False
False
False
True
False
t2_7zcixszd
False
False
False
https://styles.redditmedia.com/t5_32vr22/styles/profileIcon_snoo695b55e3-9741-464e-b189-6cfa6c570eb1-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=54b79227a77d1caa7ff1010a1ae7a713956d7778
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/SexiestBoomer
10
2
5
0
0
0
0
15
37.5
40
Commented RepliedTo
Commented RepliedTo
webscraping
webscraping
tracker scraping links api seems look edit work greatly case
tracker scraping links api seems look edit work greatly case
look,tracker once,list suggest,look guide,scraping seems,edit tracker,gg links,suggest reply,guide greatly,simplify edit,case
look,tracker once,list suggest,look guide,scraping seems,edit tracker,gg links,suggest reply,guide greatly,simplify edit,case
228.571428571429
https://styles.redditmedia.com/t5_21xgy1/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18xNzgxMTg0_rare_1c8793f3-5a57-4b61-9fdd-ccfcdaef4ace-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b0cd7aba8851a764b2e1249d25a405732ec6dd29
childishlamino
childishlamino
226.990791220591
4621.09912109375
4353.76416015625
3
3
18
0.007887
0
0.002663
0
1
133
childishlamino
7/25/2019 12:08:21 AM
0
1186
283
0
False
False
False
False
True
False
t2_487zviim
False
True
False
https://styles.redditmedia.com/t5_21xgy1/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18xNzgxMTg0_rare_1c8793f3-5a57-4b61-9fdd-ccfcdaef4ace-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b0cd7aba8851a764b2e1249d25a405732ec6dd29
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/childishlamino
10
7
2.86885245901639
3
1.22950819672131
0
0
143
58.6065573770492
244
RepliedTo Posted
Posted RepliedTo
webscraping
webscraping
get_text player find tracker trying rating data players wins strip
get_text player find tracker trying rating data players wins strip
get_text,strip 6da3,7276 'class','trn leaderboards,ranked result,'link' default,act player,find 4244,6da3 act,4cb622e1 gg,valorant
get_text,strip 6da3,7276 'class','trn leaderboards,ranked result,'link' default,act player,find 4244,6da3 act,4cb622e1 gg,valorant
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
interesting-winforms
interesting-winforms
1
611.575500488281
133.829238891602
1
1
0
0.024
1E-06
0.002144
0
1
134
Interesting-Winforms
6/17/2021 10:15:23 AM
0
1
1
0
False
False
False
False
True
False
t2_crx42y4j
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Interesting-Winforms
2
6
3.7037037037037
1
0.617283950617284
0
0
78
48.1481481481481
162
Commented
Commented
learnprogramming
learnprogramming
octoparse data google maps tutorial tools tool url scrape extract
octoparse data google maps tutorial tools tool url scrape extract
data,google google,maps octoparse,tutorial tutorial,scrape scrape,data easy,extract extract,large more,features check,octoparse octoparse,excellent
data,google google,maps octoparse,tutorial tutorial,scrape scrape,data easy,extract extract,large more,features check,octoparse octoparse,excellent
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
botcra
botcra
1
2913.69897460938
4668.86279296875
2
1
0
0.019578
1E-06
0.002333
0
0
135
botcra
3/24/2015 7:37:49 AM
0
517
100
0
False
False
False
False
True
False
t2_mgcys
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/botcra
5
0
0
2
2.85714285714286
0
0
39
55.7142857142857
70
Posted
Posted
webscraping
webscraping
file sharing google issue usp view help drive listings hls_g
file sharing google issue usp view help drive listings hls_g
google,file usp,sharing view,usp drive,google takes,images help,scraping agents,listings hello,wondering x,path path,issue
google,file usp,sharing view,usp drive,google takes,images help,scraping agents,listings hello,wondering x,path path,issue
100
https://styles.redditmedia.com/t5_1lkcft/styles/profileIcon_snoo8db414fa-2958-4cad-93e3-47b0dd73e644-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d7cfc239ac3a9b343f3d19d53921f3c5d6c5d41d
chevignon93
chevignon93
1
2318.01416015625
4527.0986328125
0
1
0
0.020481
8E-06
0.002152
0
0
136
chevignon93
4/9/2015 10:45:53 PM
0
1
1950
0
False
False
False
False
True
False
t2_mta36
False
False
False
https://styles.redditmedia.com/t5_1lkcft/styles/profileIcon_snoo8db414fa-2958-4cad-93e3-47b0dd73e644-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d7cfc239ac3a9b343f3d19d53921f3c5d6c5d41d
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/chevignon93
5
0
0
0
0
0
0
2
40
5
RepliedTo
RepliedTo
webscraping
webscraping
salesman sound
salesman sound
sound,salesman
sound,salesman
100
https://styles.redditmedia.com/t5_cjonq/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYmZkNjcwNjY3MDUzZTUxN2E5N2FmZTU2YzkxZTRmODNmMTE2MGJkM181ODM0_rare_e7cec16a-3424-4a49-b200-992f988c603f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0c32468326284f3033c2445903c948653a2c7098
tusharg19
tusharg19
1
2370.19677734375
5962.2607421875
1
1
0
0.020481
8E-06
0.002152
0
1
137
tusharg19
8/22/2016 1:56:55 PM
0
1978
772
0
False
False
False
False
True
False
t2_10scze
False
False
True
https://styles.redditmedia.com/t5_cjonq/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYmZkNjcwNjY3MDUzZTUxN2E5N2FmZTU2YzkxZTRmODNmMTE2MGJkM181ODM0_rare_e7cec16a-3424-4a49-b200-992f988c603f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0c32468326284f3033c2445903c948653a2c7098
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/tusharg19
5
1
4.34782608695652
0
0
0
0
10
43.4782608695652
23
RepliedTo
RepliedTo
webscraping
webscraping
working check vba work scrape msg website wanted tried pls
working check vba work scrape msg website wanted tried pls
work,tried wanted,scrape working,vba msg,pls pls,check tried,working scrape,news news,website website,work
work,tried wanted,scrape working,vba msg,pls pls,check tried,working scrape,news news,website website,work
100
https://styles.redditmedia.com/t5_cj3p8/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N180OTM2NDY4_rare_25bfd556-1d9f-4132-a0c9-e5208f332c4b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=1acd3a5629f0aa4dff7febd15936e7add1e84928
lostnfoundaround
lostnfoundaround
1
2290.37622070313
5544.73486328125
1
1
0
0.020481
8E-06
0.002152
0
1
138
lostnfoundaround
7/12/2013 3:58:20 AM
0
737
9997
0
False
False
False
False
True
False
t2_cckjv
False
False
True
https://styles.redditmedia.com/t5_cj3p8/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N180OTM2NDY4_rare_25bfd556-1d9f-4132-a0c9-e5208f332c4b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=1acd3a5629f0aa4dff7febd15936e7add1e84928
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/lostnfoundaround
5
0
0
0
0
0
0
1
25
4
RepliedTo
RepliedTo
webscraping
webscraping
make
make
1000
https://styles.redditmedia.com/t5_2jum7n/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18xNzYxMTI_rare_1a74e46e-2f53-4994-8529-b4c2d17ca3bc-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=10a591f594a1baab3f3d1caed6262af47160ad99
androidepsicokiller
androidepsicokiller
7609.35663775988
3968.1904296875
8851.419921875
9
5
606
0.025119
5.7E-05
0.004397
0.0138888888888889
0.333333333333333
139
AndroidePsicokiller
4/5/2020 10:50:33 PM
0
392
582
0
False
False
False
False
True
False
t2_64tza4m8
False
False
True
https://styles.redditmedia.com/t5_2jum7n/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18xNzYxMTI_rare_1a74e46e-2f53-4994-8529-b4c2d17ca3bc-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=10a591f594a1baab3f3d1caed6262af47160ad99
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/AndroidePsicokiller
6
7
2.97872340425532
6
2.5531914893617
0
0
115
48.936170212766
235
RepliedTo Commented Posted
Commented RepliedTo Posted
webscraping
webscraping
scrapy octoparse use scrap thanks apify solutions try well scraping
scrapy octoparse apify really web scrap thanks solutions try well
scrapestack,apify 300,usd world,tutorial task,really apify,web manufacturer,supplier use,software scraping,thoguht really,well thoguht,set
scrapestack,apify 300,usd world,tutorial task,really apify,web manufacturer,supplier use,software scraping,thoguht really,well thoguht,set
100
https://styles.redditmedia.com/t5_35jiwz/styles/profileIcon_snood2c2274c-00f9-4796-b618-6a009da9be26-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f3fc92c6197e7b32fdb9202bec802b36f57fe63e
spank_engine
spank_engine
1
6569.23828125
4800.08984375
0
1
0
0.004401
0
0.002192
0
0
140
Spank_Engine
9/21/2020 10:49:37 PM
0
27
4200
0
False
False
False
False
True
False
t2_7il6xnfx
False
False
False
https://styles.redditmedia.com/t5_35jiwz/styles/profileIcon_snood2c2274c-00f9-4796-b618-6a009da9be26-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f3fc92c6197e7b32fdb9202bec802b36f57fe63e
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Spank_Engine
33
0
0
0
0
0
0
9
37.5
24
RepliedTo
RepliedTo
webdev
webdev
video chapter scraping automatetheboringstuff web youtube go edit note
video chapter scraping automatetheboringstuff web youtube go edit note
youtube,video chapter,web go,automatetheboringstuff scraping,edit note,youtube automatetheboringstuff,chapter edit,note web,scraping
youtube,video chapter,web go,automatetheboringstuff scraping,edit note,youtube automatetheboringstuff,chapter edit,note web,scraping
100
https://styles.redditmedia.com/t5_13q81c/styles/profileIcon_snoo5d0c4a9e-68ed-407c-b1e9-b5b6a5271e25-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=122c0d587644a4df9673a45d90406909a8348587
mud002
mud002
1
6832.99609375
3268.90380859375
1
1
0
0.004401
0
0.002192
0
1
141
mud002
6/17/2019 8:01:21 PM
0
414
15112
0
False
False
False
False
True
False
t2_2fzcdj82
False
False
False
https://styles.redditmedia.com/t5_13q81c/styles/profileIcon_snoo5d0c4a9e-68ed-407c-b1e9-b5b6a5271e25-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=122c0d587644a4df9673a45d90406909a8348587
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/mud002
33
4
3.2
0
0
0
0
52
41.6
125
Commented RepliedTo
Commented RepliedTo
webdev
webdev
use python html basically need look grab webscraping library kind
use python html basically need look grab webscraping library kind
recommend,python environment,setup database,populate python,lots those,steps usually,id use,class python,script look,webscraping populate,html
recommend,python environment,setup database,populate python,lots those,steps usually,id use,class python,script look,webscraping populate,html
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
andre380
andre380
1
187.605651855469
6288.90087890625
1
1
0
0
0
0.002439
0
0
142
Andre380
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/Andre380
1
0
0
0
0
0
0
2
100
2
Posted
Posted
scrapinghub scrapingtheweb
scrapinghub scrapingtheweb
removed
removed
1000
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
carlpaul153
carlpaul153
3943.28380240363
1896.54614257813
1387.78674316406
7
5
314
0.023148
1E-06
0.003732
0
0.666666666666667
143
carlpaul153
4/28/2020 12:15:42 PM
0
1903
1413
0
False
False
False
False
True
False
t2_6bsqotsc
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/carlpaul153
2
29
6.83962264150943
4
0.943396226415094
0
0
186
43.8679245283019
424
Posted RepliedTo Commented
RepliedTo Posted Commented
webscraping datascience ecommerce learnprogramming dropship
webscraping datascience ecommerce dropship learnprogramming
io octoparse parsehub advanced mozenda very import dexi web removed
io advanced octoparse very web parsehub mozenda scraping import dexi
dexi,io import,io web,scraping signup,re discount,until io,dexi re,fudal6iu octoparse,octoparse mozenda,mozenda june,25
web,scraping dexi,io import,io powerful,advanced advanced,options programming,skills signup,re discount,until io,dexi re,fudal6iu
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
solalliance
solalliance
1
2224.83862304688
1188.42431640625
1
1
0
0.018808
0
0.002153
0
1
144
SolAlliance
7/9/2019 6:13:19 PM
0
939
1584
0
False
False
False
False
True
False
t2_3tf676v8
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/SolAlliance
2
1
16.6666666666667
0
0
0
0
1
16.6666666666667
6
Commented
Commented
webscraping
webscraping
write thank
write thank
thank,write
thank,write
1000
https://styles.redditmedia.com/t5_36suy3/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18xODMyODU_rare_ca2b9ccf-5b0c-411e-8938-affa44121095-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a1492d93a4a066fb53b6f210e0bb28036e1abebe
gidoneli
gidoneli
3240.20134082846
1382.90295410156
1909.14709472656
1
3
258
0.026979
1E-06
0.002396
0
0.333333333333333
145
Gidoneli
9/30/2020 8:07:12 AM
0
404
125
0
False
False
False
False
True
False
t2_8aff3if1
False
False
False
https://styles.redditmedia.com/t5_36suy3/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18xODMyODU_rare_ca2b9ccf-5b0c-411e-8938-affa44121095-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a1492d93a4a066fb53b6f210e0bb28036e1abebe
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Gidoneli
2
7
3.72340425531915
1
0.531914893617021
0
0
97
51.5957446808511
188
Commented RepliedTo
RepliedTo Commented
webscraping WebDeveloper
WebDeveloper webscraping
website brightdata proxy vitariz data rotating io residential grsm one
website one user crawler proxy data rotating residential dca using
rotating,residential brightdata,grsm io,vitariz grsm,io vitariz,dca recommend,data using,rotating collector,brightdata data,collector crawler,using
rotating,residential vitariz,dca recommend,data using,rotating collector,brightdata data,collector crawler,using geo,blocks booking,recommend better,specs
714.285714285714
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
propilmetil
propilmetil
1080.73378027615
1746.24060058594
1781.22412109375
3
3
86
0.023148
0
0.002393
0
1
146
propilmetil
6/28/2021 7:32:03 AM
0
2
0
0
False
False
False
False
True
False
t2_czdjpnvs
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/propilmetil
2
3
1.71428571428571
0
0
0
0
72
41.1428571428571
175
Commented Posted RepliedTo
Commented RepliedTo Posted
webscraping
webscraping
trying extract ciab watch setelsperpage 50 php data 7i5o53sz6dy registrar
ciab watch setelsperpage 50 php 7i5o53sz6dy registrar action youtube bg
youtube,watch php,action watch,7i5o53sz6dy setelsperpage,50 public,ciab bg,index registrar,setelsperpage ciab,bg index,php extract,data
youtube,watch php,action watch,7i5o53sz6dy setelsperpage,50 public,ciab bg,index registrar,setelsperpage ciab,bg index,php action,registrar
100
https://styles.redditmedia.com/t5_3tt45f/styles/profileIcon_snoo3da2aca6-a370-4b61-b72a-f4fbb2affa56-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ccc18771937720146a208e759d3d1a9a7adfcf4b
justinserpapi
justinserpapi
1
2194.638671875
1764.02990722656
1
1
0
0.018808
0
0.002153
0
1
147
justinSerpApi
1/28/2021 11:31:48 PM
0
5
7
0
False
False
False
False
True
False
t2_a1p0x8t7
False
False
True
https://styles.redditmedia.com/t5_3tt45f/styles/profileIcon_snoo3da2aca6-a370-4b61-b72a-f4fbb2affa56-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ccc18771937720146a208e759d3d1a9a7adfcf4b
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/justinSerpApi
2
2
2.98507462686567
0
0
0
0
37
55.2238805970149
67
Commented
Commented
webscraping
webscraping
serpapi playground scrape google disclaimer result engine walmart looking ebay
serpapi playground scrape google disclaimer result engine walmart looking ebay
serpapi,playground options,scrape many,search youtube,walmart google,baidu area,shows scrape,serpapi serpapi,handles search,engine result,example
serpapi,playground options,scrape many,search youtube,walmart google,baidu area,shows scrape,serpapi serpapi,handles search,engine result,example
100
https://styles.redditmedia.com/t5_1srkuh/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18zMTEzOTQ_rare_b9d174af-3187-4ac6-944f-fb9810b51046-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f35b0b6a9fb639b0c304e4561f4b38c775a8c236
eranthius
eranthius
1
2241.10400390625
1479.9189453125
0
1
0
0.018808
0
0.002153
0
0
148
Eranthius
12/3/2013 4:07:32 AM
0
2521
1049
0
False
False
False
False
True
False
t2_e6ac5
False
False
False
https://styles.redditmedia.com/t5_1srkuh/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18zMTEzOTQ_rare_b9d174af-3187-4ac6-944f-fb9810b51046-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f35b0b6a9fb639b0c304e4561f4b38c775a8c236
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Eranthius
2
2
5
0
0
0
0
14
35
40
Commented
Commented
webscraping
webscraping
need mind awesome later direction write step dms help hit
need mind awesome later direction write step dms help hit
later,need direction,need mind,hit right,direction write,need thanks,awesome need,data saved,thanks step,right data,step
later,need direction,need mind,hit right,direction write,need thanks,awesome need,data saved,thanks step,right data,step
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
regular-exchange8376
regular-exchange8376
1
8968.8623046875
3687.30908203125
0
1
0
0.002445
0
0.002439
0
0
149
Regular-Exchange8376
4/8/2021 12:06:00 AM
0
94
7247
0
False
False
False
False
True
False
t2_bct8myr6
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/Regular-Exchange8376
68
1
4.54545454545455
0
0
0
0
15
68.1818181818182
22
RepliedTo
RepliedTo
Quebec
Quebec
c'est top crisse chiant lamp soit peux mais encore partout
c'est top crisse chiant lamp soit peux mais encore partout
vieux,c'est peux,croire c'est,vieux c'est,chiant mais,crisse soit,top honnête,peux cinq,c'est lamp,soit encore,partout
vieux,c'est peux,croire c'est,vieux c'est,chiant mais,crisse soit,top honnête,peux cinq,c'est lamp,soit encore,partout
100
https://styles.redditmedia.com/t5_ebig9/styles/profileIcon_snoodbae900d-c741-45cd-8dca-ee17b67a897f-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6137a38b59ef24942cbb58d470439f190a606f6d
waptaff
waptaff
1
8968.8623046875
4210.76123046875
1
0
0
0.002445
0
0.002439
0
0
150
waptaff
5/19/2016 12:20:37 AM
0
642
40077
0
False
False
False
False
True
False
t2_y1d72
False
False
False
https://styles.redditmedia.com/t5_ebig9/styles/profileIcon_snoodbae900d-c741-45cd-8dca-ee17b67a897f-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6137a38b59ef24942cbb58d470439f190a606f6d
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/waptaff
68
100
https://styles.redditmedia.com/t5_ce3hw/styles/profileIcon_snoo64d3b4d7-e15d-4ca3-97f0-f35bf99ff4fd-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e60e41371d0ed0a005ae18f3d67e60c6f2d65c76
kornikopic
kornikopic
1
8564.2197265625
3687.30908203125
0
1
0
0.002445
0
0.002439
0
0
151
kornikopic
6/27/2011 11:11:56 PM
0
1738
8607
0
False
False
False
False
True
False
t2_5fp4d
False
False
False
https://styles.redditmedia.com/t5_ce3hw/styles/profileIcon_snoo64d3b4d7-e15d-4ca3-97f0-f35bf99ff4fd-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e60e41371d0ed0a005ae18f3d67e60c6f2d65c76
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/kornikopic
67
0
0
0
0
0
0
2
100
2
RepliedTo
RepliedTo
Quebec
Quebec
consistant constant
consistant constant
consistant,constant
consistant,constant
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
cppislife
cppislife
1
8564.2197265625
4210.76123046875
1
0
0
0.002445
0
0.002439
0
0
152
CppIsLife
9/23/2020 1:14:23 PM
0
263
5861
0
False
False
False
False
True
False
t2_87b6cil4
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/CppIsLife
67
100
https://styles.redditmedia.com/t5_lwwx1/styles/profileIcon_snoo1c9aaceb-d83c-4fd6-be57-fd43b2bace4a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e3d6737f93bcd66cc84e5cdceb18cb5a3ccde741
cdash04
cdash04
1
8968.8623046875
4807.21142578125
0
1
0
0.002445
0
0.002439
0
0
153
cdash04
7/17/2018 3:03:44 PM
0
285
2209
0
False
False
False
False
True
False
t2_1smxtym1
False
False
False
https://styles.redditmedia.com/t5_lwwx1/styles/profileIcon_snoo1c9aaceb-d83c-4fd6-be57-fd43b2bace4a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e3d6737f93bcd66cc84e5cdceb18cb5a3ccde741
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/cdash04
66
1
1.36986301369863
0
0
0
0
34
46.5753424657534
73
RepliedTo
RepliedTo
Quebec
Quebec
typescript javascript donc ce superset ts objective séparer langage employeurs
typescript javascript donc ce superset ts objective séparer langage employeurs
certains,employeurs typescript,nécessairement donc,oui spécifiquement,développeurs engagent,spécifiquement existe,plusieurs plusieurs,autres dit,donc objective,certains superset,pourtant
certains,employeurs typescript,nécessairement donc,oui spécifiquement,développeurs engagent,spécifiquement existe,plusieurs plusieurs,autres dit,donc objective,certains superset,pourtant
100
https://styles.redditmedia.com/t5_e67h2/styles/profileIcon_snoo55e8f058-08c2-47f8-b458-bfec742ffbc3-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3ba237ab8ba4064ae63baae704e49493ef33c745
camelonn
camelonn
1
8968.8623046875
5334.22412109375
1
0
0
0.002445
0
0.002439
0
0
154
Camelonn
7/4/2015 4:14:03 PM
0
1345
6657
0
False
False
False
False
True
False
t2_oj80p
False
False
False
https://styles.redditmedia.com/t5_e67h2/styles/profileIcon_snoo55e8f058-08c2-47f8-b458-bfec742ffbc3-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3ba237ab8ba4064ae63baae704e49493ef33c745
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Camelonn
66
[155] u/Vote4flipflop (vote4flipflop) — created 4/8/2018 2:03:54 AM; counts 651 / 2507; id t2_15radc6w
  profile: https://www.reddit.com/user/Vote4flipflop
  match 100; metrics 1, 8750.9775390625, 5597.73095703125, 1, 1, 0, 0.002445, 0, 0.002439, 0, 1 | 65, 0, 0, 2, 3.77358490566038, 0, 0, 23, 43.3962264150943, 53; flags FFFFTF / FFF / FFTT
  activity: RepliedTo in r/webscraping — text: "usually few specific engine write don scrape interact sites going"
  keyword pairs: engine,mods search,engine stuff,scratch don,know mods,scrape 5000,sites depends,language interact,usually write,stuff language,re
[156] u/Iam_the_analyst (iam_the_analyst) — created 1/15/2020 5:21:00 PM; counts 114 / 117; id t2_5gb4qpo2
  profile: https://www.reddit.com/user/Iam_the_analyst
  match 100; metrics 1, 8377.4599609375, 4543.705078125, 1, 1, 0, 0.002445, 0, 0.002439, 0, 1 | 65, 10, 6.99300699300699, 2, 1.3986013986014, 0, 0, 39, 27.2727272727273, 143; flags FFFFTF / FFF / FFTF
  activity: RepliedTo in r/webscraping — text: "work looking feedback easy build tool help literally thanks engine" / "work looking tool help literally thanks engine good send way"
  keyword pairs: appreciate,feedback don,much min,work use,pay mod,use based,trust outrageous,totally programming,background hoping,build scratch,easy
[157] u/Miserable_Author (miserable_author) — created 2/6/2020 11:40:16 PM; counts 7 / 107; id t2_20p4r36c
  profile: https://www.reddit.com/user/Miserable_Author
  match 100; metrics 1, 503.528503417969, 6288.90087890625, 1, 1, 0, 0, 0, 0.002439, 0, 0 | 1, 1, 12.5, 0, 0, 0, 0, 3, 37.5, 8; flags FFFFTF / FFF / FFTT
  activity: RepliedTo in r/webscraping — text: "maybe help well graphdb"
  keyword pairs: graphdb,help help,well maybe,graphdb
[158] u/mostlybeak — created 2/20/2017 11:43:39 PM; counts 5 / 213; id t2_15lcq0
  profile: https://www.reddit.com/user/mostlybeak
  match 142.857142857143; metrics 76.3302637401968, 6832.99609375, 2314.3251953125, 4, 3, 6, 0.007335, 0, 0.003179, 0, 0.666666666666667 | 32, 4, 4.81927710843374, 2, 2.40963855421687, 0, 0, 30, 36.144578313253, 83; flags FFFFTF / FFF / FFTF
  activity: RepliedTo, Posted in r/webscraping — text: "someone octoparse help bad realize way looking don necessarily being" / "someone help bad realize way looking don necessarily being thing"
  keyword pairs: necessarily,trying don,see thank,necessarily easy,way thing,ask someone,walk help,especially looking,hire realize,learn octoparse,easy
[159] u/noah_bd — created 1/4/2022 1:24:18 PM; counts 1 / 3; id t2_hham89vk
  profile: https://www.reddit.com/user/noah_bd
  match 100; metrics 1, 6524.12109375, 2677.92431640625, 1, 1, 0, 0.004401, 0, 0.002192, 0, 1 | 32, 1, 2.7027027027027, 0, 0, 0, 0, 19, 51.3513513513514, 37; flags FFFFTF / FFF / FFTT
  activity: RepliedTo in r/webscraping — text: "zone scraping discovery web brightdata content tutorials tools offer octoparse"
  keyword pairs: web,scraping brightdata,discovery discovery,zone octoparse,web offer,full tools,offer full,tutorials amazing,content interested,learning tutorials,amazing
[160] u/omnipotentsoul — created 9/2/2016 5:46:07 AM; counts 6 / 9; id t2_113zaf
  profile: https://www.reddit.com/user/omnipotentsoul
  match 100; metrics 1, 6348.40966796875, 1673.62182617188, 0, 1, 0, 0.004401, 0, 0.002192, 0, 0 | 32, 0, 0, 1, 4.34782608695652, 0, 0, 13, 56.5217391304348, 23; flags FFFFTF / FFF / FFTT
  activity: RepliedTo in r/webscraping — text: "technical scraping issue web non taught already classes people before"
  keyword pairs: non,technical solved,workflow technical,solved taught,several before,people web,scraping technical,non scraping,classes people,technical several,web
[161] u/just-sum-dude69 — created 1/1/2022 4:51:44 PM; counts 7427 / 23552; id t2_i55xsgce
  profile: https://www.reddit.com/user/just-sum-dude69
  match 100; metrics 1, 6293.470703125, 3197.68579101563, 1, 1, 0, 0.004401, 0, 0.002192, 0, 1 | 32, 1, 2.38095238095238, 0, 0, 0, 0, 18, 42.8571428571429, 42; flags FFFFTF / FFT / FFTT
  activity: Commented in r/webscraping — text: "tutor rely scraping more need space teach based work being"
  keyword pairs: work,research rely,tutor much,more require,work being,tutored teach,rely activities,require space,scraping based,activities programming,based
[162] u/Inventeurduzdong (inventeurduzdong) — created 10/23/2022 7:48:34 AM; counts 1 / 0; id t2_8fk2xr2b
  profile: https://www.reddit.com/user/Inventeurduzdong
  match 100; metrics 1, 1767.21984863281, 6950.515625, 1, 1, 0, 0, 0, 0.002439, 0, 0 | 1, 2, 3.27868852459016, 1, 1.63934426229508, 0, 0, 29, 47.5409836065574, 61; flags FFFFTF / FFF / FFTT
  activity: Posted in r/webscraping — text: "continue scraped first result hello problem scraping next octoparse works"
  keyword pairs: etc,scraped knows,problem extract,duplicated continue,scraping pagination,never create,loop scraped,first continue,extract octoparse,click loop,pagination
[163] u/Reddit-Book-Bot (reddit-book-bot) — created 1/1/0001 12:00:00 AM; counts 0 / 0; no account id
  profile: https://www.reddit.com/user/Reddit-Book-Bot
  match 100; metrics 1, 4969.54150390625, 5016.65673828125, 0, 1, 0, 0.021032, 0, 0.002182, 0, 0 | 12, 1, 2.27272727272727, 0, 0, 0, 0, 29, 65.9090909090909, 44; flags FFTFFF,FFF / FFFF
  activity: RepliedTo in r/learnpython — text: "reddit bot 1984 book user books good old ### more"
  keyword pairs: reddit,user user,reddit reddit,book book,bot bot,more 1984,snewd orwell,good george,orwell copy,### snewd,ebooks
[165] u/AutoModerator (automoderator) — created 1/5/2012 5:24:28 AM; counts 1000 / 1000; id t2_6l4z3
  profile: https://www.reddit.com/user/AutoModerator
  match 685.714285714286; metrics 1030.51360444936, 4069.56274414063, 672.800109863281, 0, 7, 82, 0.017464, 0, 0.003194, 0, 0 | 9, 19, 2.31143552311436, 15, 1.82481751824818, 0, 0, 375, 45.6204379562044, 822; flags FFFFTF / FTT / FFTT
  activity: Commented in r/excel, r/googlesheets, r/PiratedGames, r/content_marketing, r/startups — text: "reddit wiki piratedgames excel questions please automatically comments bot moderators" / "piratedgames excel reddit comments part body googlesheets question read include"
  keyword pairs: bot,action moderators,subreddit reddit,piratedgames automatically,please contact,moderators message,compose action,performed please,contact questions,concerns subreddit,message / reddit,piratedgames piratedgames,comments excel,wiki part,reddit excel,comments example,excel guide,reddit read,whole videogame,piracy sure,read
[166] u/Emergency-Nose-2980 (emergency-nose-2980) — created 1/1/0001 12:00:00 AM; counts 0 / 0; no account id
  profile: https://www.reddit.com/user/Emergency-Nose-2980
  match 100; metrics 1, 3960.80126953125, 71.2179489135742, 2, 1, 0, 0.01063, 0, 0.002315, 0, 0 | 9, 0, 0, 0, 0, 0, 0, 1, 100, 1; flags FFTFFF,FFF / FFFF
  activity: Posted in r/PiratedGames — text: "removed"
[167] u/theopinionexpert — created 6/28/2021 2:07:41 PM; counts 995 / 5790; id t2_cp947oj5
  profile: https://www.reddit.com/user/theopinionexpert
  match 100; metrics 1, 1451.29711914063, 6288.90087890625, 1, 1, 0, 0, 0, 0.002439, 0, 0 | 1, 1, 5.26315789473684, 0, 0, 0, 0, 7, 36.8421052631579, 19; flags FFFFTF / FFF / FFTF
  activity: Posted in r/webscraping — text: "octoparse experts pointer use around quick very appreciate"
  keyword pairs: appreciate,quick use,octoparse experts,around octoparse,very very,experts quick,pointer around,appreciate
[168] u/Lost_Buy_8044 (lost_buy_8044) — created 7/16/2022 6:49:01 AM; counts 1 / 0; id t2_q2trkh4z
  profile: https://www.reddit.com/user/Lost_Buy_8044
  match 100; metrics 1, 1767.21984863281, 6288.90087890625, 1, 1, 0, 0, 0, 0.002439, 0, 0 | 1; flags FFFFTF / FFF / FFTT
  activity: Posted in r/windows
[169] u/Mikeyandwind (mikeyandwind) — created 9/23/2015 10:04:35 PM; counts 214 / 930; id t2_qoz11
  profile: https://www.reddit.com/user/Mikeyandwind
  match 100; metrics 1, 8159.57568359375, 5334.22412109375, 0, 1, 0, 0.002445, 0, 0.002269, 0, 0 | 64, 0, 0, 1, 9.09090909090909, 0, 0, 4, 36.3636363636364, 11; flags FFFFTF / FFF / FFTT
  activity: Commented in r/webscraping — text: "same having solution problem find"
  keyword pairs: having,same find,solution problem,find same,problem
[170] u/Representative_Art71 (representative_art71) — created 6/23/2020 6:20:20 PM; counts 5 / 0; id t2_587wtdfi
  profile: https://www.reddit.com/user/Representative_Art71
  match 100; metrics 1, 8159.57568359375, 4807.21142578125, 2, 1, 0, 0.002445, 0, 0.002609, 0, 0 | 64, 3, 1.31578947368421, 4, 1.75438596491228, 0, 0, 132, 57.8947368421053, 228; flags FFFFTF / FFF / FFTT
  activity: Posted in r/webscraping — text: "scroll octoparse infinite data 150 search scrape gov articles 6470993"
  keyword pairs: infinite,scroll lines,data 6470993,dealing search,dca helpcenter,octoparse dca,ca ca,gov articles,6470993 dealing,pagination octoparse,octoparse
[171] u/Kaligule (kaligule) — created 8/25/2012 2:46:30 PM; counts 4071 / 24636; id t2_8s7hc
  profile: https://www.reddit.com/user/Kaligule
  match 100; metrics 1, 2473.57080078125, 7510.5546875, 0, 1, 0, 0.023961, 0.153925, 0.002133, 0, 0 | 3, 0, 0, 0, 0, 0, 0, 4, 33.3333333333333, 12; flags FFFFTF / FFF / FFTT
  activity: Commented in u_Octoparseideas — text: "commandline single same wget"
  keyword pairs: single,wget same,single wget,commandline
[172] u/bastimars — created 3/6/2019 5:36:45 PM; counts 711 / 24222; id t2_3cvk5mbv
  profile: https://www.reddit.com/user/bastimars
  match 100; metrics 1, 5366.06640625, 2143.66015625, 0, 1, 0, 0.006656, 0, 0.00226, 0, 0 | 13, 0, 0, 0, 0, 0, 0, 5, 45.4545454545455, 11; flags FFFFTF / FFF / FFTT
  activity: RepliedTo in r/france — text (fr): "tk hardware outils cao tcl" — roughly: tk hardware tools CAD tcl
  keyword pairs: cao,hardware outils,cao tk,outils tcl,tk
[173] u/Snykeurs (snykeurs) — created 2/3/2018 10:58:56 AM; counts 21374 / 57051; id t2_v46pstl
  profile: https://www.reddit.com/user/Snykeurs
  match 185.714285714286; metrics 151.660527480394, 5286.6298828125, 1486.16345214844, 1, 1, 12, 0.009984, 0, 0.002497, 0, 0 | 13, 1, 3.2258064516129, 1, 3.2258064516129, 0, 0, 17, 54.8387096774194, 31; flags FFFFTF / FFT / FFTT
  activity: Commented in r/france — text (fr): "très langage compétences niche salaire ces rust peux connais car" — roughly: very language skills niche salary these rust can know because
  keyword pairs: connais,langage genre,cobol recherché,hardware langage,niche compétences,sont salaire,car car,ces gros,salaire peux,négocier ces,compétences
[174] u/nanami2977 — created 8/12/2021 1:50:56 AM; counts 9 / 0; id t2_dv3teqjz
  profile: https://www.reddit.com/user/nanami2977
  match 385.714285714286; metrics 503.201758267979, 5201.2529296875, 779.527954101563, 7, 2, 40, 0.014976, 0, 0.003964, 0, 0.166666666666667 | 13, 0, 0, 0, 0, 0, 0, 114, 60.3174603174603, 189; flags FFFFTF / FFF / FFTT
  activity: Posted, RepliedTo in r/programmation, r/france — text (fr): "programmation france langages blog plus suis différence octoparse fais salaire" / "programmation france langages blog plus article c'est l'article exemple cet" — roughly: programming france languages blog more am difference octoparse do salary / article it's the-article example this
  keyword pairs: langages,programmation programmation,plus mes,études france,octoparse travailler,sud partager,salaires exacts,fais octoparse,fr paris,mais emplois,programmeurs / langages,programmation programmation,plus l'article,langages ooop,j'ai traite,certaines cet,article article,langages ici,cet sud,toulouse merci,votre
[175] u/Guidule (guidule) — created 2/22/2019 5:12:15 PM; counts 51 / 153; id t2_3a07wh8m
  profile: https://www.reddit.com/user/Guidule
  match 100; metrics 1, 5623.51123046875, 836.4970703125, 0, 1, 0, 0.008557, 0, 0.002158, 0, 0 | 13, 0, 0, 0, 0, 0, 0, 37, 63.7931034482759, 58; flags FFFFTF / FFF / FFFF
  activity: Commented in r/france — text (fr): "emploi mieux 2020 langages plus très bidon article java développeur" — roughly: job better 2020 languages more very bogus article java developer
  keyword pairs: langages,plus 2020,langages bidon,voila demandes,mieux payés,emploi actu,315699 vrai,étude mieux,payes espoir,annee payes,java
[176] u/Live-Cover4440 (live-cover4440) — created 6/8/2021 12:06:08 PM; counts 160 / 29044; id t2_cllcql31
  profile: https://www.reddit.com/user/Live-Cover4440
  match 100; metrics 1, 5115.67529296875, 71.2179489135742, 0, 1, 0, 0.008557, 0, 0.002158, 0, 0 | 13, 1, 4.16666666666667, 0, 0, 0, 0, 14, 58.3333333333333, 24; flags FFFFTF / FFF / FFTT
  activity: Commented in r/france — text (fr): "l'article monde c# grand gros goland java sql fait semble" — roughly: the-article world c# big large goland java sql does seems
  keyword pairs: quand,même gros,doute monde,fait doute,l'article l'article,quand fait,sql grand,monde c#,ça sql,goland goland,avant
[177] u/viagrabrain — created 3/9/2021 6:38:36 AM; counts 15 / 680; id t2_760p8pc3
  profile: https://www.reddit.com/user/viagrabrain
  match 100; metrics 1, 4808.294921875, 483.463104248047, 0, 1, 0, 0.008557, 0, 0.002158, 0, 0 | 13, 1, 1.61290322580645, 1, 1.61290322580645, 0, 0, 30, 48.3870967741936, 62; flags FFFFTF / FFF / FFTT
  activity: Commented in r/france — text (fr): "region mal depuis regarder dehors cherche ta utile d'opportunités combo" / "region depuis regarder dehors cherche ta utile d'opportunités combo pandemie" — roughly: region since look outside seek your useful of-opportunities combo pandemic
  keyword pairs: data,python territoire,meme autour,data full,remote super,combo tout,territoire aimes,problematiques combo,mal beauuucoup,plus region,parisienne
[178] u/shinversus — created 9/11/2013 6:18:18 PM; counts 14 / 5746; id t2_d40uz
  profile: https://www.reddit.com/user/shinversus
  match 100; metrics 1, 4827.29541015625, 1122.32348632813, 0, 1, 0, 0.008557, 0, 0.002158, 0, 0 | 13, 2, 1.21951219512195, 2, 1.21951219512195, 0, 0, 93, 56.7073170731707, 164; flags FFFFTF / FFF / FFTT
  activity: Commented in r/france — text (fr): "langage c'est technique même important mais plus déjà tout salaires" — roughly: language it's technical even important but more already all salaries
  keyword pairs: même,langage juste,cherche parce,qu'ils valent,différents tous,bagage variées,conseille énormément,variation équipent,variées tant,junior junior,ta
[179] u/niraj06 — created 12/14/2021 6:16:32 AM; counts 28 / 7; id t2_hjrts4l5
  profile: https://www.reddit.com/user/niraj06
  match 100; metrics 1, 1632.13330078125, 2927.82397460938, 0, 1, 0, 0.017701, 0, 0.002206, 0, 0 | 2, 3, 3.06122448979592, 1, 1.02040816326531, 0, 0, 47, 47.9591836734694, 98; flags FFFFTF / FFF / FFTT
  activity: Commented in r/WebDeveloper — text: "web io scratching main things crawl others monster dexi one"
  keyword pairs: web,scratching main,web others,graphical investigate,main abilities,future expressed,exceptionally monster,mozenda data,scraping based,devices require,coding
[180] u/Embarrassed_Law_253 (embarrassed_law_253) — created 12/14/2021 5:03:20 AM; counts 1215 / 16; id t2_eep16llf
  profile: https://www.reddit.com/user/Embarrassed_Law_253
  match 657.142857142857; metrics 980.293428622559, 1535.15197753906, 2419.1142578125, 3, 1, 78, 0.021494, 0, 0.002657, 0, 0 | 2, 12, 3.16622691292876, 5, 1.31926121372032, 0, 0, 214, 56.4643799472296, 379; flags FFFFTF / FFT / FFTT
  activity: Posted in r/WebDeveloper — text: "scraping web api scrape scraper octoparse extract proxy cloud scraperapi"
  keyword pairs: web,scraping scraping,api web,scraper scrape,cloud html,website scraper,tool scrape,data online,scraping forms,enter one,greatest
[181] u/Pupniko (pupniko) — created 3/9/2016 1:03:58 PM; counts 230 / 42765; id t2_wa69x
  profile: https://www.reddit.com/user/Pupniko
  match 100; metrics 1, 3705.529296875, 7734.2060546875, 0, 1, 0, 0.013649, 0.007261, 0.002267, 0, 0 | 3, 2, 2.8169014084507, 2, 2.8169014084507, 0, 0, 24, 33.8028169014084, 71; flags FFFFTF / FFF / FFTT
  activity: RepliedTo in r/content_marketing — text: "content exactly think trust writers fields past enough pay bad"
  keyword pairs: trust,past expertise,fields content,written issues,finding fields,commissioning taking,long journalists,expertise use,experienced writers,use finding,writers
[182] u/SpaceForceAwakens (spaceforceawakens) — created 1/5/2019 4:42:42 AM; counts 23009 / 198708; id t2_2xeoqvj3
  profile: https://www.reddit.com/user/SpaceForceAwakens
  match 385.714285714286; metrics 503.201758267979, 3401.22924804688, 7769.7705078125, 2, 1, 40, 0.018275, 0.034973, 0.002588, 0, 0.5 | 3, 12, 5.76923076923077, 4, 1.92307692307692, 0, 0, 70, 33.6538461538462, 208; flags FFFFTF / FTT / FFTT
  activity: RepliedTo in r/content_marketing — text: "pay writing people good knowledge things start those creative such"
  keyword pairs: rates,same writing,creative writers,sites both,very know,knowledge same,problem sites,fiverr things,writing knowledge,problem find,right
[183] u/DivaJanelle (divajanelle) — created 3/4/2014 10:21:40 PM; counts 415 / 15740; id t2_fk0k8
  profile: https://www.reddit.com/user/DivaJanelle
  match 642.857142857143; metrics 955.18334070916, 3071.0732421875, 7787.6376953125, 1, 2, 76, 0.026299, 0.161186, 0.002327, 0, 0.5 | 3, 2, 1.28205128205128, 1, 0.641025641025641, 0, 0, 70, 44.8717948717949, 156; flags FFFFTF / FFF / FFTT
  activity: Commented, RepliedTo in r/content_marketing — text: "journalists content words 1200 several current back writing word marketing" / "words 1200 several current back writing word marketing published etc"
  keyword pairs: hoping,direction abso,ing people,suggest specific,amount journalists,laid paying,piece base,knowledge 1200,words based,need amount,articles
[184] u/writerDiana (writerdiana) — created 9/18/2019 6:09:44 PM; counts 14 / 70; id t2_4m6a902c
  profile: https://www.reddit.com/user/writerDiana
  match 100; metrics 1, 2925.30151367188, 8479.5517578125, 0, 1, 0, 0.023961, 0.153925, 0.002133, 0, 0 | 3, 5, 2.82485875706215, 4, 2.25988700564972, 0, 0, 64, 36.1581920903955, 177; flags FFFFTF / FFT / FFTT
  activity: Commented in r/content_marketing — text: "pay writers quality good time find writer work long days"
  keyword pairs: tighten,belt short,important people,days vibe,lot important,employer term,collaborations writer,pay overwhelmed,work long,short complete,project
[185] u/KingCapital- (kingcapital-) — created 1/1/0001 12:00:00 AM; counts 0 / 0; no account id
  profile: https://www.reddit.com/user/KingCapital-
  match 100; metrics 1, 2578.99267578125, 6617.9609375, 0, 1, 0, 0.023961, 0.153925, 0.002133, 0, 0 | 3, 0, 0, 0, 0, 0, 0, 6, 35.2941176470588, 17; flags FFTFFF,FFF / FFFF
  activity: Commented in r/content_marketing — text: "linkedin writers indian spammed week put"
  keyword pairs: spammed,indian linkedin,spammed put,linkedin writers,week indian,writers
[186] u/Dravodin (dravodin) — created 12/3/2020 7:37:05 AM; counts 79 / 744; id t2_92vlosco
  profile: https://www.reddit.com/user/Dravodin
  match 100; metrics 1, 2337.9775390625, 8375.8896484375, 0, 1, 0, 0.023961, 0.153925, 0.002133, 0, 0 | 3, 0, 0, 0, 0, 0, 0, 3, 50, 6; flags FFFFTF / FFF / FFTT
  activity: Commented in r/content_marketing — text: "dm samples sharing"
  keyword pairs: sharing,samples samples,dm
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
successmysterious887
successmysterious887
1
2715.16772460938
7145.31494140625
0
1
0
0.023961
0.153925
0.002133
0
0
187
SuccessMysterious887
1/22/2021 4:33:28 PM
0
7
0
0
False
False
False
False
True
False
t2_9xu62vzs
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/SuccessMysterious887
3
1
2.17391304347826
1
2.17391304347826
0
0
18
39.1304347826087
46
Commented
Commented
content_marketing
content_marketing
hopefully late friend's company proposition send end really well shortly
hopefully late friend's company proposition send end really well shortly
late,known think,well shortly,hopefully company,exactly hopefully,end exactly,looking send,over bit,late looking,hopefully friend's,company
late,known think,well shortly,hopefully company,exactly hopefully,end exactly,looking send,over bit,late looking,hopefully friend's,company
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
samantha-diane
samantha-diane
1
2740.20971679688
6587.66015625
0
1
0
0.023961
0.153925
0.002133
0
0
188
Samantha-diane
1/8/2021 3:34:14 AM
0
1
5
0
False
False
False
False
True
False
t2_9psgt6os
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Samantha-diane
3
4
10.8108108108108
0
0
0
0
15
40.5405405405405
37
Commented
Commented
content_marketing
content_marketing
good writer suited helps average writers work very writersaccess science
good writer suited helps average writers work very writersaccess science
concierge,helps good,writers science,tech writersaccess,very helps,find work,science good,resource deliver,need resource,good writer,deliver
concierge,helps good,writers science,tech writersaccess,very helps,find work,science good,resource deliver,need resource,good writer,deliver
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
adrianhorning
adrianhorning
1
6384.0302734375
8396.595703125
0
1
0
0.005589
0
0.002178
0
0
189
adrianhorning
2/5/2017 9:23:14 PM
0
17
205
0
False
False
False
False
True
False
t2_153z81
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/adrianhorning
19
2
1.96078431372549
2
1.96078431372549
0
0
36
35.2941176470588
102
Commented
Commented
webscraping
webscraping
lambda use someone functions minutes changes 20 think thing told
lambda use someone functions minutes changes 20 think thing told
lambda,functions address,looks started,think quick,way went,lambda someone,work use,fails spin,instances way,started someone,told
lambda,functions address,looks started,think quick,way went,lambda someone,work use,fails spin,instances way,started someone,told
185.714285714286
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
liberalexpenditures
liberalexpenditures
151.660527480394
6563.2333984375
7431.5947265625
5
1
12
0.00978
0
0.003484
0
0
190
LiberalExpenditures
5/12/2014 12:43:11 AM
0
258
6649
0
False
False
False
False
True
False
t2_gixh7
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/LiberalExpenditures
19
5
2.63157894736842
14
7.36842105263158
0
0
80
42.1052631578947
190
Posted
Posted
webscraping
webscraping
port delay still octoparse data error enter ip unfortunately between
port delay still octoparse data error enter ip unfortunately between
delay,between fill,gaps hard,lesson mention,anything general,guidance protonvpn,unfortunately provides,option downloaded,protonvpn haven't,quite website,still
delay,between fill,gaps hard,lesson mention,anything general,guidance protonvpn,unfortunately provides,option downloaded,protonvpn haven't,quite website,still
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
ristoriel
ristoriel
1
6293.470703125
6790.544921875
0
1
0
0.005589
0
0.002178
0
0
191
ristoriel
12/10/2020 3:04:13 AM
0
2
1
0
False
False
False
False
True
False
t2_992lkk8n
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ristoriel
19
5
5.05050505050505
2
2.02020202020202
0
0
45
45.4545454545455
99
Commented
Commented
webscraping
webscraping
ip many proxies well requests time threats intervals used rotate
ip many proxies well requests time threats intervals used rotate
webscraping,ips well,calculate definite,intervals many,requests reduce,possibilities proxies,webscraping passive,agressive send,requests datacenter,proxies chances,getting
webscraping,ips well,calculate definite,intervals many,requests reduce,possibilities proxies,webscraping passive,agressive send,requests datacenter,proxies chances,getting
100
https://styles.redditmedia.com/t5_52nwl3/styles/profileIcon_snooaee91ee2-3d44-4812-a6ea-81a61efc8f0a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=344ac18abda7396b3499cb57e4da43bad167c594
humorminimum1707
humorminimum1707
1
6832.99609375
8072.6416015625
0
1
0
0.005589
0
0.002178
0
0
192
HumorMinimum1707
9/22/2021 5:30:46 AM
0
3853
769
0
False
False
False
False
True
False
t2_ep4gqfku
False
False
False
https://styles.redditmedia.com/t5_52nwl3/styles/profileIcon_snooaee91ee2-3d44-4812-a6ea-81a61efc8f0a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=344ac18abda7396b3499cb57e4da43bad167c594
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/HumorMinimum1707
19
0
0
0
0
0
0
37
51.3888888888889
72
Commented
Commented
webscraping
webscraping
octoparse proxy rotating tutorial proxies# set need thos brightdata settings
octoparse proxy rotating tutorial proxies# set need thos brightdata settings
set,proxies# tutorial,set octoparse,tutorial proxies#,first need,subscribe ip,rotation solutions,rotating guide,octoparse proxy,server's brightdata,solutions
set,proxies# tutorial,set octoparse,tutorial proxies#,first need,subscribe ip,rotation solutions,rotating guide,octoparse proxy,server's brightdata,solutions
100
https://styles.redditmedia.com/t5_750ui0/styles/profileIcon_snood6b6ec18-3861-458e-b072-0a735dbec425-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b246b0f082cbf250c170e05fef474b9cccd5eef5
amandakamen
amandakamen
1
6742.4345703125
6466.58984375
0
1
0
0.005589
0
0.002178
0
0
193
AmandaKamen
10/3/2022 12:07:39 PM
0
1
7
0
False
False
False
False
True
False
t2_t16pkdzh
False
False
False
https://styles.redditmedia.com/t5_750ui0/styles/profileIcon_snood6b6ec18-3861-458e-b072-0a735dbec425-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b246b0f082cbf250c170e05fef474b9cccd5eef5
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/AmandaKamen
19
0
0
0
0
0
0
14
46.6666666666667
30
Commented
Commented
webscraping
webscraping
utm_source utm_medium reddit ips answer proxy rotate residential use tutorial
utm_source utm_medium reddit ips answer proxy rotate residential use tutorial
tutorial,done soax,utm_source reddit,utm_content answer,ips proxy,soax utm_content,answer utm_medium,reddit ips,rotate use,residential social,utm_medium
tutorial,done soax,utm_source reddit,utm_content answer,ips proxy,soax utm_content,answer utm_medium,reddit ips,rotate use,residential social,utm_medium
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
autistic_alpha
autistic_alpha
1
3051.68872070313
3496.18310546875
0
1
0
0.016436
0
0.00217
0
0
194
autistic_alpha
1/2/2020 1:50:23 PM
0
92
218
0
False
False
False
False
True
False
t2_5cr3wiq3
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/autistic_alpha
5
1
11.1111111111111
0
0
0
0
3
33.3333333333333
9
Commented
Commented
webscraping
webscraping
taco favor cerveza bueno
taco favor cerveza bueno
bueno,cerveza cerveza,favor favor,taco
bueno,cerveza cerveza,favor favor,taco
1000
https://styles.redditmedia.com/t5_2cei57/styles/profileIcon_35hoq558s5w51.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0f2ac10f0949c1e4b88c8a8a795257071e5dd59a
melisaxinyue
melisaxinyue
2336.2381759461
2917.19799804688
4085.71875
5
2
186
0.020481
2E-06
0.003231
0
0.25
195
melisaxinyue
1/9/2020 8:12:04 AM
0
1
1
0
False
False
False
False
True
False
t2_579soaru
False
False
False
https://styles.redditmedia.com/t5_2cei57/styles/profileIcon_35hoq558s5w51.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0f2ac10f0949c1e4b88c8a8a795257071e5dd59a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/melisaxinyue
5
292
0.466066526208262
108
0.172380769967439
0
0
34860
55.6406818617123
62652
Posted Commented RepliedTo
Commented RepliedTo Posted
u_melisaxinyue webscraping CoronavirusRecession bigdata api hedgefund analyzit visualization
webscraping u_melisaxinyue bigdata CoronavirusRecession hedgefund api visualization analyzit
datos web octoparse scraping extraer sitios blog información cómo contenido
captcha datos web zapier contenido #x200b precios curso scraping rgpd
web,scraping sitios,web octoparse,blog sitio,web extraer,datos big,data extracción,datos datos,web raspado,web página,web
web,scraping big,data datos,web datos,financieros sitios,web seguimiento,precios import,io agregación,contenido extracción,datos análisis,datos
114.285714285714
https://styles.redditmedia.com/t5_749ppz/styles/profileIcon_snoocd1679c4-818c-4b45-b157-c76f7cd8a14f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=813ead9e3c03a67a4135261833846c2aa7b7a8a7
intelligent-age-3129
intelligent-age-3129
26.1100879133989
7162.5029296875
584.061096191406
3
3
2
0.00489
0
0.002882
0
1
196
Intelligent-Age-3129
9/29/2022 6:55:08 PM
0
31
483
0
False
False
False
False
True
False
t2_s3fsxk3q
False
False
False
https://styles.redditmedia.com/t5_749ppz/styles/profileIcon_snoocd1679c4-818c-4b45-b157-c76f7cd8a14f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=813ead9e3c03a67a4135261833846c2aa7b7a8a7
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Intelligent-Age-3129
44
9
2.84810126582278
1
0.316455696202532
0
0
139
43.9873417721519
316
RepliedTo Posted
Posted RepliedTo
webscraping
webscraping
data competition softr file io heat list millennium2022 before friend
competition softr heat list millennium2022 file io friend dance htm
softr,io heat,list mngr,millennium2022 io,softr 1x,year before,competition comp,mngr free,competition solution,friend list,take
heat,list softr,io mngr,millennium2022 before,competition comp,mngr io,softr 1x,year free,competition solution,friend list,take
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
accomplished-gap-748
accomplished-gap-748
1
7452.560546875
71.2179489135742
1
1
0
0.00326
0
0.002217
0
1
197
Accomplished-Gap-748
4/29/2021 3:56:22 PM
0
134
116
0
False
False
False
False
True
False
t2_b906iol6
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Accomplished-Gap-748
44
1
2.27272727272727
2
4.54545454545455
0
0
16
36.3636363636364
44
Commented RepliedTo
Commented RepliedTo
webscraping
webscraping
python data completely frequently updated yes impossible lines scrap easy
python data completely frequently updated yes impossible lines scrap easy
completely,impossible few,lines competition,data easy,scrap lines,python yes,extract seems,easy python,completely frequently,updated impossible,use
completely,impossible few,lines competition,data easy,scrap lines,python yes,extract seems,easy python,completely frequently,updated impossible,use
100
https://styles.redditmedia.com/t5_acyqt/styles/profileIcon_snooe9d55e59-45c7-4f82-808f-01d15b3da887-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3d5d048c19c5e0569e0da88d08e1c3626928923a
goblin80
goblin80
1
6862.6396484375
1096.75646972656
1
1
0
0.00326
0
0.002217
0
1
198
Goblin80
8/21/2011 4:19:53 PM
0
268
443
0
False
False
False
False
True
False
t2_5pejv
False
False
False
https://styles.redditmedia.com/t5_acyqt/styles/profileIcon_snooe9d55e59-45c7-4f82-808f-01d15b3da887-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3d5d048c19c5e0569e0da88d08e1c3626928923a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Goblin80
44
0
0
0
0
0
0
209
68.976897689769
303
RepliedTo Commented
Commented RepliedTo
webscraping
webscraping
' x map queryselectorall textcontent div table join flatmap csv
' x map queryselectorall textcontent div table join flatmap url
join,' x,x map,x ',br ',' x,join textcontent,trim flat,function br,' body,innerhtml
join,' x,x map,x ',br ',' x,join textcontent,trim flat,function br,' appears,site's
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
ajah-wawan
ajah-wawan
1
2327.51879882813
7094.93212890625
0
1
0
0.023961
0.153925
0.002133
0
0
199
ajah-wawan
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/ajah-wawan
3
15
5.26315789473684
0
0
0
0
153
53.6842105263158
285
Commented
Commented
u_Octoparseideas
u_Octoparseideas
website tool email low analytics research marketing traffic social details
website tool email low analytics research marketing traffic social details
traffic,website email,marketing google,analytics allows,schedule insights,low low,cost high,open suggestions,add price,global conversions,adplify
traffic,website email,marketing google,analytics allows,schedule insights,low low,cost high,open suggestions,add price,global conversions,adplify
142.857142857143
https://styles.redditmedia.com/t5_bpvq2/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNjIyZDhmZWE0NjAzYmE5ZWRhZjEwODRiNDA3MDUyZDhiMGE5YmVkN18yOTU4MTk5_rare_d95e4bc9-8f80-4be5-a65e-b6d3b91f79c3-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6b0d46f4c0489215197ffc629f89911226abfcf3
jeenyusjane
jeenyusjane
76.3302637401968
8431.2109375
7524.56689453125
4
3
6
0.007335
0
0.003179
0
0.666666666666667
200
JeenyusJane
1/3/2010 5:12:43 AM
0
3211
8037
0
False
False
False
False
True
False
t2_3syx8
False
False
True
https://styles.redditmedia.com/t5_bpvq2/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNjIyZDhmZWE0NjAzYmE5ZWRhZjEwODRiNDA3MDUyZDhiMGE5YmVkN18yOTU4MTk5_rare_d95e4bc9-8f80-4be5-a65e-b6d3b91f79c3-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6b0d46f4c0489215197ffc629f89911226abfcf3
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/JeenyusJane
31
11
2.75
1
0.25
0
0
194
48.5
400
RepliedTo Posted
Posted RepliedTo
whatcarshouldIbuy
whatcarshouldIbuy
car cars use price airtable research make api listings mmy
car price airtable research make listings mmy explore model see
make,model explore,true airtable,universe expxibkgrqb4gtli4,car universe,expxibkgrqb4gtli4 car,buyers marketcheck's,api cars,tab price,miles research,explore
make,model explore,true airtable,universe expxibkgrqb4gtli4,car universe,expxibkgrqb4gtli4 car,buyers marketcheck's,api cars,tab price,miles research,explore
100
https://styles.redditmedia.com/t5_1rek2q/styles/profileIcon_snoo3c9e2eed-6fe2-4e16-a0e9-6887ae5fa130-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=1e17533613d0755e0b25b94bd10d0926a22be46c
theguyonabike
theguyonabike
1
8715.404296875
7022.08984375
1
1
0
0.004401
0
0.002192
0
1
201
theguyonabike
3/21/2014 10:29:39 PM
0
1636
4617
0
False
False
False
False
True
False
t2_fspwp
False
False
False
https://styles.redditmedia.com/t5_1rek2q/styles/profileIcon_snoo3c9e2eed-6fe2-4e16-a0e9-6887ae5fa130-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=1e17533613d0755e0b25b94bd10d0926a22be46c
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/theguyonabike
31
2
10.5263157894737
0
0
0
0
9
47.3684210526316
19
Commented
Commented
whatcarshouldIbuy
whatcarshouldIbuy
insurance cool cross helpful getting something quotes very various models
insurance cool cross helpful getting something quotes very various models
helpful,getting various,models quotes,cross shopping,various insurance,quotes cross,shopping something,helpful very,cool cool,something getting,insurance
helpful,getting various,models quotes,cross shopping,various insurance,quotes cross,shopping something,helpful very,cool cool,something getting,insurance
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
1920sbusinessman
1920sbusinessman
1
8461.63671875
8396.595703125
0
1
0
0.004401
0
0.002192
0
0
202
1920sBusinessMan
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/1920sBusinessMan
31
1
11.1111111111111
0
0
0
0
2
22.2222222222222
9
Commented
Commented
whatcarshouldIbuy
whatcarshouldIbuy
cool later check
cool later check
cool,check check,later
cool,check check,later
100
https://styles.redditmedia.com/t5_ebofq/styles/profileIcon_snoo9fcef9b7-c527-49bc-bdd4-c31d37b0c8d6-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6ade0ca51a642ff8f374e931a25dc5f974a4db6b
collectmoments
collectmoments
1
8116.59130859375
7155.01416015625
1
1
0
0.004401
0
0.002192
0
1
203
collectmoments
7/4/2012 3:43:15 PM
0
96
1364
0
False
False
False
False
True
False
t2_8834m
False
False
True
https://styles.redditmedia.com/t5_ebofq/styles/profileIcon_snoo9fcef9b7-c527-49bc-bdd4-c31d37b0c8d6-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6ade0ca51a642ff8f374e931a25dc5f974a4db6b
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/collectmoments
31
1
16.6666666666667
0
0
0
0
2
33.3333333333333
6
Commented
Commented
whatcarshouldIbuy
whatcarshouldIbuy
paid api right
paid api right
paid,api api,right
paid,api api,right
100
https://styles.redditmedia.com/t5_jqil1/styles/profileIcon_snooa39b1e74-ab65-49c9-bf0e-36e3f2e73a34-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b94266a25cf0bd484069130c0df409d95ab976d2
irrelevant-opinion
irrelevant-opinion
1
4399.71435546875
9600.3291015625
0
1
0
0.016538
8E-06
0.00216
0
0
204
Irrelevant-Opinion
5/28/2018 7:39:07 PM
0
1032
5585
0
False
False
False
False
True
False
t2_1gosh7qo
False
False
False
https://styles.redditmedia.com/t5_jqil1/styles/profileIcon_snooa39b1e74-ab65-49c9-bf0e-36e3f2e73a34-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b94266a25cf0bd484069130c0df409d95ab976d2
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/Irrelevant-Opinion
6
1
6.25
0
0
0
0
9
56.25
16
Commented
Commented
webscraping
webscraping
elements javascript certain dynamic using website loaded code familiar octoparse
elements javascript certain dynamic using website loaded code familiar octoparse
javascript,code dynamic,certain familiar,octoparse website,dynamic using,javascript elements,loaded octoparse,website loaded,using certain,elements
javascript,code dynamic,certain familiar,octoparse website,dynamic using,javascript elements,loaded octoparse,website loaded,using certain,elements
1000
https://styles.redditmedia.com/t5_omd1l/styles/profileIcon_rtkhbyfpysf61.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=f3898ef4414ccb3c1ed8c4795486ce9642fe6d14
whoamithelaw
whoamithelaw
2989.10046169447
4233.5107421875
8736.94140625
5
3
238
0.02064
3.2E-05
0.002889
0.0833333333333333
0.5
205
WhoAmITheLaw
9/16/2018 6:23:22 AM
0
1195
6181
0
False
False
False
False
True
False
t2_27zog3ld
False
False
True
https://styles.redditmedia.com/t5_omd1l/styles/profileIcon_rtkhbyfpysf61.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=f3898ef4414ccb3c1ed8c4795486ce9642fe6d14
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/WhoAmITheLaw
6
1
2.5
1
2.5
0
0
18
45
40
Posted RepliedTo
RepliedTo Posted
webscraping
webscraping
failed provide try pass software one yes know prefer later
failed provide try pass software one yes know prefer later
installed,day again,later provide,details know,prefer reminder,failed later,screen one,loading haven't,pass failed,try try,again
installed,day again,later provide,details know,prefer reminder,failed later,screen one,loading haven't,pass failed,try try,again
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
codeskunky
codeskunky
1
5347.52978515625
6979.35888671875
0
1
0
0.013726
0
0.002262
0
0
206
CodeSkunky
4/27/2019 11:13:10 AM
0
127
2303
0
False
False
False
False
True
False
t2_3o4t3cee
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/CodeSkunky
12
0
0
1
4.34782608695652
0
0
12
52.1739130434783
23
RepliedTo
RepliedTo
Python
Python
youtube retard accounts everybody going use never kbkeio full watch
youtube retard accounts everybody going use never kbkeio full watch
everybody,knows youtube,watch use,accounts accounts,everybody retard,youtube watch,oakg never,go going,use full,retard oakg,kbkeio
everybody,knows youtube,watch use,accounts accounts,everybody retard,youtube watch,oakg never,go going,use full,retard oakg,kbkeio
657.142857142857
https://styles.redditmedia.com/t5_cqsu7/styles/profileIcon_snoo1fc773ba-c279-4e64-b95b-ff57216eea98-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c88a3be5508341d445ba48c1be05bcf46c9b1d5f
cym13
cym13
980.293428622559
5343.72998046875
6448.09765625
1
1
78
0.015902
0
0.002517
0
0
207
cym13
1/14/2015 7:42:56 PM
0
6558
20307
0
False
False
False
False
True
False
t2_kps1w
False
False
False
https://styles.redditmedia.com/t5_cqsu7/styles/profileIcon_snoo1fc773ba-c279-4e64-b95b-ff57216eea98-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c88a3be5508341d445ba48c1be05bcf46c9b1d5f
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/cym13
12
4
5.33333333333333
4
5.33333333333333
0
0
29
38.6666666666667
75
Commented
Commented
Python
Python
given mention words bad downvotes necessary guess think fact more
given mention words bad downvotes necessary guess think fact more
more,complex given,downvotes wonder,fair complex,statements big,part acknowledged,big horse,race given,limited think,necessary words,acknowledged
more,complex given,downvotes wonder,fair complex,statements big,part acknowledged,big horse,race given,limited think,necessary words,acknowledged
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
mcmasilmof
mcmasilmof
1
8159.57568359375
4210.76123046875
0
1
0
0.002445
0
0.002269
0
0
208
McMasilmof
2/9/2014 7:15:53 PM
0
1891
78256
0
False
False
False
False
True
False
t2_f71y1
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/McMasilmof
63
2
1.73913043478261
0
0
0
0
54
46.9565217391304
115
Commented
Commented
AskProgramming
AskProgramming
way html curl website language use search requirements people lot
way html curl website language use search requirements people lot
tries,prevent selenium,simulating use,webservice info,simple python,beautifull file,info use,programmig tricks,lot etc,work download,html
tries,prevent selenium,simulating use,webservice info,simple python,beautifull file,info use,programmig tricks,lot etc,work download,html
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
deztabilizeur
deztabilizeur
1
8159.57568359375
3687.30908203125
2
1
0
0.002445
0
0.002609
0
0
209
Deztabilizeur
8/22/2018 6:13:15 AM
0
1613
3734
0
False
False
False
False
True
False
t2_1i00srd6
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Deztabilizeur
63
3
2.5
3
2.5
0
0
46
38.3333333333333
120
Posted
Posted
AskProgramming
AskProgramming
web information best way scrap scraping looking site solution project
web information best way scrap scraping looking site solution project
best,way web,site site,print availaible,library library,screen learn,best scraping,service information,local people,askprogramming scrap,information
best,way web,site site,print availaible,library library,screen learn,best scraping,service information,local people,askprogramming scrap,information
100
https://styles.redditmedia.com/t5_fk4io/styles/profileIcon_c9ar0lzi8qd51.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=b3f6d948efa7a7a81162cae4c92ebc1c7665364f
chickenthugs
chickenthugs
1
819.451293945313
6288.90087890625
1
1
0
0
0
0.002439
0
0
210
ChickenThugs
2/8/2018 7:10:20 AM
0
826
1159
0
False
False
False
False
True
False
t2_6cbntik
False
False
False
https://styles.redditmedia.com/t5_fk4io/styles/profileIcon_c9ar0lzi8qd51.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=b3f6d948efa7a7a81162cae4c92ebc1c7665364f
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ChickenThugs
1
0
0
0
0
0
0
30
55.5555555555556
54
Posted
Posted
AskProgramming
AskProgramming
website items program suggestions specific octoparse current being single info
website items program suggestions specific octoparse current being single info
current,suggestions specific,items built,tool public,price pre,built input,parameters find,pre edit,current anyone,know such,program
current,suggestions specific,items built,tool public,price pre,built input,parameters find,pre edit,current anyone,know such,program
100
https://styles.redditmedia.com/t5_7im57/styles/profileIcon_zkno2fslfe301.png?width=256&height=256&crop=256:256,smart&v=enabled&s=44f121ba25e6f9b1eb2f519a81104c8d24bba56a
credobot
credobot
1
9781.1142578125
5334.22412109375
0
1
0
0.002445
0
0.002269
0
0
211
CredoBot
4/19/2017 5:40:43 PM
0
1
7162
0
False
False
False
False
True
False
t2_177ujg
False
False
True
https://styles.redditmedia.com/t5_7im57/styles/profileIcon_zkno2fslfe301.png?width=256&height=256&crop=256:256,smart&v=enabled&s=44f121ba25e6f9b1eb2f519a81104c8d24bba56a
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/CredoBot
62
1
1.01010101010101
0
0
0
0
59
59.5959595959596
99
Commented
Commented
slavelabour
slavelabour
0d insert found dancingrobot123 redditor create text links slrep credo360
0d insert found dancingrobot123 redditor create text links slrep credo360
found,create slrep,november 0d,known reddit,slrep credo,verifications add,links dancingrobot123,sl skills,services info,dancingrobot123 profile,text
found,create slrep,november 0d,known reddit,slrep credo,verifications add,links dancingrobot123,sl skills,services info,dancingrobot123 profile,text
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
dancingrobot123
dancingrobot123
1
9781.1142578125
4807.21142578125
2
1
0
0.002445
0
0.002609
0
0
212
Dancingrobot123
11/27/2016 5:38:48 PM
0
5069
901
0
False
False
False
False
True
False
t2_134llb
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Dancingrobot123
62
3
4.34782608695652
0
0
0
0
30
43.4782608695652
69
Posted
Posted
slavelabour
slavelabour
info log link people company octoparse clicking referral profile scrape
info log link people company octoparse clicking referral profile scrape
comments,referral another,app employees,title scrape,info location,info provide,log app,widget referral,another link,company's people,tried
comments,referral another,app employees,title scrape,info location,info provide,log app,widget referral,another link,company's people,tried
100
https://styles.redditmedia.com/t5_67bnrh/styles/profileIcon_8hl4lv182ws81.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=03a41983349393a230de03fe698c7c77e7837ee9
sandeepnatoo
sandeepnatoo
1
1135.37414550781
6288.90087890625
1
1
0
0
0
0.002439
0
0
213
SandeepNatoo
4/11/2022 11:46:51 AM
0
1
2
0
False
False
False
False
True
False
t2_lt57w7nn
False
False
False
https://styles.redditmedia.com/t5_67bnrh/styles/profileIcon_8hl4lv182ws81.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=03a41983349393a230de03fe698c7c77e7837ee9
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/SandeepNatoo
1
47
4.88058151609553
4
0.415368639667705
0
0
504
52.3364485981308
963
Posted
Posted
u_SandeepNatoo
u_SandeepNatoo
scraping data #x200b offers web free tools users features allows
scraping data #x200b offers web free tools users features allows
web,scraping allows,users plan,starts scraping,tools scrapy,framework data,scraping free,trial data,collection paid,plan free,plan
web,scraping allows,users plan,starts scraping,tools scrapy,framework data,scraping free,trial data,collection paid,plan free,plan
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
jaypat87
jaypat87
1
9563.2294921875
4543.705078125
2
2
0
0.002445
0
0.002609
0
1
214
jaypat87
2/27/2019 2:47:02 AM
0
70
606
0
False
False
False
False
True
False
t2_36fmdn14
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/jaypat87
61
20
3.12989045383412
9
1.40845070422535
0
0
317
49.6087636932707
639
RepliedTo Posted Commented
Commented Posted RepliedTo
Entrepreneur
Entrepreneur
twitter bio cost influencers tools scraper micro search month code
cost influencers month code etc io followers pay freelancer bio
twitter,bio micro,influencers bio,scraper io,free near,atlanta twitter,search vicinitas,io exportdata,io twitter,followers download,twitter
micro,influencers twitter,bio io,free near,atlanta twitter,search vicinitas,io exportdata,io twitter,followers download,twitter tools,download
100
https://styles.redditmedia.com/t5_4gir6y/styles/profileIcon_c4vjs0h12m071.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=d675eb52d776af6053954365cd94f240cbeae10e
geezeer84
geezeer84
1
9186.748046875
5597.73095703125
1
1
0
0.002445
0
0.002269
0
1
215
geezeer84
5/21/2021 2:58:37 PM
0
743
11416
0
False
False
False
False
True
False
t2_boj0h0tn
False
False
False
https://styles.redditmedia.com/t5_4gir6y/styles/profileIcon_c4vjs0h12m071.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=d675eb52d776af6053954365cd94f240cbeae10e
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/geezeer84
61
1
5.26315789473684
0
0
0
0
10
52.6315789473684
19
Commented
Commented
Entrepreneur
Entrepreneur
registration tweets free requires api download 500 allows month twitter
registration tweets free requires api download 500 allows month twitter
registration,allows api,requires month,free requires,registration 500,000 twitter,api 000,tweets tweets,month download,500 allows,download
registration,allows api,requires month,free requires,registration 500,000 twitter,api 000,tweets tweets,month download,500 allows,download
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
educational_big_3934
educational_big_3934
1
187.605651855469
4965.671875
1
1
0
0
0
0.002439
0
0
216
Educational_Big_3934
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/Educational_Big_3934
1
Posted
Posted
videos
videos
100
https://styles.redditmedia.com/t5_3xsy8m/styles/profileIcon_snoo5bf396fe-6cae-4619-8d78-3365b907e620-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=8884c69d3f1d942529f1aa4f1291506ccd900865
somecroissantswe
somecroissantswe
1
5492.9736328125
237.676864624023
1
1
0
0.008557
0
0.002158
0
1
217
somecroissantswe
2/10/2021 7:30:17 PM
0
16
843
0
False
False
False
False
True
False
t2_aa5vgd2z
False
False
False
https://styles.redditmedia.com/t5_3xsy8m/styles/profileIcon_snoo5bf396fe-6cae-4619-8d78-3365b907e620-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=8884c69d3f1d942529f1aa4f1291506ccd900865
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/somecroissantswe
13
2
1.61290322580645
2
1.61290322580645
0
0
79
63.7096774193548
124
Commented
Commented
programmation
programmation
salary are your these companies languages however general but often
salary are your these companies languages however general but often
python,java say,python general,you'll-have nothing,stops-you your,seniority high,outside stops-you,however however,more often,unicorns oscillate,40
python,java say,python general,you'll-have nothing,stops-you your,seniority high,outside stops-you,however however,more often,unicorns oscillate,40
100
https://styles.redditmedia.com/t5_cvz6g/styles/profileIcon_snood1a02815-f802-45ae-8fd5-3cd86485d60c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ab575aea28e887f78ad55139e5c5b2f9d9cc7e00
bobbysteel
bobbysteel
1
7305.08056640625
5136.5947265625
0
1
0
0.003667
0
0.00221
0
0
218
bobbysteel
9/29/2006 9:01:22 AM
0
6370
4470
0
False
False
False
False
True
False
t2_k87d
False
False
False
https://styles.redditmedia.com/t5_cvz6g/styles/profileIcon_snood1a02815-f802-45ae-8fd5-3cd86485d60c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ab575aea28e887f78ad55139e5c5b2f9d9cc7e00
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/bobbysteel
30
0
0
0
0
0
0
4
80
5
Commented
Commented
webscraping
webscraping
fedex apis ups use
fedex apis ups use
fedex,ups ups,apis use,fedex
fedex,ups ups,apis use,fedex
128.571428571429
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
soatzotakanoshi
soatzotakanoshi
51.2201758267979
7305.08056640625
4442.21923828125
3
1
4
0.005501
0
0.002732
0
0
219
SoatzoTakanoshi
1/6/2013 4:50:40 AM
0
88
2038
0
False
False
False
False
True
False
t2_a4avt
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/SoatzoTakanoshi
30
2
2.66666666666667
0
0
0
0
33
44
75
Posted
Posted
webscraping
webscraping
ups octoparse fedex data tracking program help give appreciated possible
ups octoparse fedex data tracking program help give appreciated possible
inputs,tracking ups,fedex time,arrival fedex,ups data,anyone ups,tracking use,octoparse shipping,hundreds hundreds,things arrival,give
inputs,tracking ups,fedex time,arrival fedex,ups data,anyone ups,tracking use,octoparse shipping,hundreds hundreds,things arrival,give
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
its_kiddos
its_kiddos
1
7010.12060546875
5136.5947265625
0
1
0
0.003667
0
0.002264
0
0
220
its_kiddos
9/30/2016 8:15:10 AM
0
7329
12420
0
False
False
False
False
True
False
t2_11s232
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/its_kiddos
30
1
9.09090909090909
1
9.09090909090909
0
0
4
36.3636363636364
11
RepliedTo
RepliedTo
webscraping
webscraping
thanks work saved ass yooooo shit
thanks work saved ass yooooo shit
yooooo,thanks ass,work shit,saved thanks,shit saved,ass
yooooo,thanks ass,work shit,saved thanks,shit saved,ass
128.571428571429
https://styles.redditmedia.com/t5_lh54z/styles/profileIcon_snoo0e0313a9-dd5b-49d5-968f-9ff7e2ca4ced-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6e25201049aa3ec9520f0888b2e34577aede7b70
mehpew
mehpew
51.2201758267979
7010.12060546875
4442.21923828125
1
1
4
0.005501
0
0.002549
0
0
221
Mehpew
7/8/2018 1:09:17 PM
0
2009
43
0
False
False
False
False
True
False
t2_1qa4eosj
False
False
False
https://styles.redditmedia.com/t5_lh54z/styles/profileIcon_snoo0e0313a9-dd5b-49d5-968f-9ff7e2ca4ced-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6e25201049aa3ec9520f0888b2e34577aede7b70
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Mehpew
30
0
0
0
0
0
0
20
57.1428571428571
35
Commented
Commented
webscraping
webscraping
fedex ups tracking number bing packagetrackingv2 packnum replace carrier
fedex ups tracking number bing packagetrackingv2 packnum replace carrier
tracking,number fedex,ups bing,packagetrackingv2 packagetrackingv2,packnum packnum,tracking ups,fedex number,fedex packnum,bing carrier,fedex number,carrier
tracking,number fedex,ups bing,packagetrackingv2 packagetrackingv2,packnum packnum,tracking ups,fedex number,fedex packnum,bing carrier,fedex number,carrier
100
https://styles.redditmedia.com/t5_5u7xfs/styles/profileIcon_snoo39e7f410-3d68-4300-8f68-2d948ca40f83-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=471a1bdbba7cdad227259074d4e4e77c75ca53a4
inevitable-dish-732
inevitable-dish-732
1
2874.93481445313
7968.5205078125
0
1
0
0.023961
0.153925
0.002133
0
0
222
Inevitable-Dish-732
2/12/2022 6:26:45 AM
0
1
4
0
False
False
False
False
True
False
t2_jm1rrawo
False
False
False
https://styles.redditmedia.com/t5_5u7xfs/styles/profileIcon_snoo39e7f410-3d68-4300-8f68-2d948ca40f83-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=471a1bdbba7cdad227259074d4e4e77c75ca53a4
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Inevitable-Dish-732
3
1
8.33333333333333
0
0
0
0
5
41.6666666666667
12
Commented
Commented
Octoparse_ideas
Octoparse_ideas
marketing email used salesblink think best
marketing email used salesblink think best
email,marketing best,email salesblink,best marketing,used think,salesblink
email,marketing best,email salesblink,best marketing,used think,salesblink
228.571428571429
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
lieutenant_lowercase
lieutenant_lowercase
226.990791220591
3871.3330078125
1815.45446777344
1
2
18
0.009404
0
0.002535
0
0.5
223
lieutenant_lowercase
11/26/2012 9:28:23 PM
0
20175
14889
0
False
False
False
False
True
False
t2_9pvwh
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/lieutenant_lowercase
9
0
0
2
5.71428571428571
0
0
17
48.5714285714286
35
Commented RepliedTo
Commented RepliedTo
datascience
datascience
relied relatively scraping scraper bought screwed automated closed forget happened
relied relatively scraping scraper bought screwed automated closed forget happened
screwed,relatively everyone,relied code,scraper forget,happened automated,scraping down,leaving scraping,tool closed,down leaving,everyone trivial,code
screwed,relatively everyone,relied code,scraper forget,happened automated,scraping down,leaving scraping,tool closed,down leaving,everyone trivial,code
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
sasjkh3333
sasjkh3333
1
3774.1689453125
2321.705078125
1
1
0
0.006986
0
0.002263
0
1
224
sasjkh3333
5/26/2016 12:53:35 PM
0
16
6
0
False
False
False
False
True
False
t2_y85pc
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/sasjkh3333
9
0
0
0
0
0
0
1
50
2
RepliedTo
RepliedTo
datascience
datascience
happened
happened
328.571428571429
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
paulblack2025
paulblack2025
402.761406614383
3966.13232421875
1274.42834472656
3
1
32
0.012868
0
0.002454
0
0
225
paulblack2025
6/13/2016 8:59:49 AM
0
3
0
0
False
False
False
False
True
False
t2_yosy7
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/paulblack2025
9
0
0
0
0
0
0
4
100
4
Posted
Posted
datamining startups bigdata_analytics multihub BusinessIntelligence datascience eFreebies dataisbeautiful learnprogramming techsupport
datamining startups pics dataisbeautiful learnprogramming BusinessIntelligence Underdog_Promotions eFreebies datascience IMadeThis
removed
removed
342.857142857143
https://styles.redditmedia.com/t5_1z91vr/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV83MTQ5Nw_rare_cc7da484-9814-4271-bbb0-1fc48b39c420-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=7fe5f1e00d424f81329667105649a87eae955a47
jfiney
jfiney
427.871494527782
3908.767578125
3224.5751953125
4
1
34
0.00978
0
0.002944
0
0
227
JFiney
11/15/2011 11:28:52 PM
0
7190
10398
0
False
False
False
False
True
False
t2_69184
False
False
True
https://styles.redditmedia.com/t5_1z91vr/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV83MTQ5Nw_rare_cc7da484-9814-4271-bbb0-1fc48b39c420-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=7fe5f1e00d424f81329667105649a87eae955a47
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/JFiney
10
0
0
0
0
0
0
50
44.2477876106195
113
Posted
Posted
webscraping
webscraping
yellowpages search allow url state city location list business name
yellowpages search allow url state city location list business name
name,location city,state tools,scrapestorm yellowpages,yellowpages field,one data,url need,input structure,yellowpages give,list location,city
name,location city,state tools,scrapestorm yellowpages,yellowpages field,one data,url need,input structure,yellowpages give,list location,city
100
https://styles.redditmedia.com/t5_5bbieh/styles/profileIcon_n4mb5095kwy71.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b51ab7f032d62926ab714ad71d5c4d877f8ce192
scrapecrow
scrapecrow
1
3735.17333984375
3676.31274414063
0
1
0
0.007191
0
0.002184
0
0
228
scrapecrow
11/10/2021 12:00:02 PM
0
19
478
0
False
False
False
False
True
False
t2_fcucgspq
False
False
False
https://styles.redditmedia.com/t5_5bbieh/styles/profileIcon_n4mb5095kwy71.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b51ab7f032d62926ab714ad71d5c4d877f8ce192
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/scrapecrow
10
4
2.38095238095238
0
0
0
0
107
63.6904761904762
168
Commented
Commented
webscraping
webscraping
tree yellowpages text httpx css python scrape aside name website
tree yellowpages text httpx css python scrape aside name website
tree,css css,#main #main,aside httpx,parsel scrape,yellowpages aside,phone terminal,command ozumo,japanese href,address bit,python
tree,css css,#main #main,aside httpx,parsel scrape,yellowpages aside,phone terminal,command ozumo,japanese href,address bit,python
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
rakesh3368
rakesh3368
1
7600.78173828125
6630.39111328125
0
1
0
0.00326
0
0.002217
0
0
229
rakesh3368
3/31/2018 11:11:15 AM
0
1015
600
0
False
False
False
False
True
False
t2_y4wfc69
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/rakesh3368
43
1
11.1111111111111
1
11.1111111111111
0
0
3
33.3333333333333
9
Commented
Commented
webscraping
webscraping
dm help tough problem need
dm help tough problem need
problem,dm tough,problem need,help dm,need
problem,dm tough,problem need,help dm,need
114.285714285714
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
tranzadikt
tranzadikt
26.1100879133989
7600.78173828125
5989.4296875
3
1
2
0.00489
0
0.002882
0
0
230
tranzadikt
6/19/2014 8:22:00 PM
0
208
10
0
False
False
False
False
True
False
t2_h1rbx
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/tranzadikt
43
0
0
0
0
0
0
21
46.6666666666667
45
Posted
Posted
webscraping
webscraping
wsowkw come anyone picture wujcigh dm emails instance jumbled net
wsowkw come anyone picture wujcigh dm emails instance jumbled net
org,picture octoparse,instance picture,anyone net,uotlsu instance,wujcigh jumbled,octoparse solution,pay wsowkw,org come,jumbled uotlsu,wsowkw
org,picture octoparse,instance picture,anyone net,uotlsu instance,wujcigh jumbled,octoparse solution,pay wsowkw,org come,jumbled uotlsu,wsowkw
100
https://styles.redditmedia.com/t5_31eeoh/styles/profileIcon_snoo2890404c-b55e-4221-9041-9bc37e1cd5c8-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f89ad94614f1d51d7d63125e77f1835c68a0fb29
devildaniii
devildaniii
1
7837.935546875
6630.39111328125
0
1
0
0.00326
0
0.002217
0
0
231
devildaniii
8/26/2020 12:38:35 PM
0
11109
223
0
False
False
False
False
True
False
t2_7uxl8ll6
False
False
False
https://styles.redditmedia.com/t5_31eeoh/styles/profileIcon_snoo2890404c-b55e-4221-9041-9bc37e1cd5c8-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f89ad94614f1d51d7d63125e77f1835c68a0fb29
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/devildaniii
43
0
0
0
0
0
0
7
50
14
Commented
Commented
webscraping
webscraping
guess try simple using think regex extracting
guess try simple using think regex extracting
try,extracting extracting,using regex,simple think,try using,regex simple,guess
try,extracting extracting,using regex,simple think,try using,regex simple,guess
100
https://styles.redditmedia.com/t5_tj0nj/styles/profileIcon_snoo73924172-e500-4b82-83b7-1d3c1cfb26da-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ab3f6bdff56ca7bc0d2cd67852103b6850a66411
hib3
hib3
1
9781.1142578125
4210.76123046875
0
1
0
0.002445
0
0.002269
0
0
232
Hib3
12/27/2018 5:04:29 PM
0
2187
5573
0
False
False
False
False
True
False
t2_2vaqeqg7
False
False
False
https://styles.redditmedia.com/t5_tj0nj/styles/profileIcon_snoo73924172-e500-4b82-83b7-1d3c1cfb26da-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ab3f6bdff56ca7bc0d2cd67852103b6850a66411
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Hib3
60
0
0
0
0
0
0
1
100
1
Commented
Commented
lowlevelaware
lowlevelaware
I'm amazed you managed to find something like this.
I'm amazed you managed to find something like this.
100
https://styles.redditmedia.com/t5_3odjf/styles/profileIcon_snoo1c8ded84-6f42-47a9-9407-53975611869c-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d965c28da94d216d24524e7047b009c6c02e1a8b
mao1756
mao1756
1
9781.1142578125
3687.30908203125
2
1
0
0.002445
0
0.002609
0
0
233
mao1756
4/16/2017 6:22:47 AM
0
10800
22204
0
False
False
False
False
True
False
t2_1748ac
False
False
True
https://styles.redditmedia.com/t5_3odjf/styles/profileIcon_snoo1c8ded84-6f42-47a9-9407-53975611869c-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d965c28da94d216d24524e7047b009c6c02e1a8b
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/mao1756
60
0
0
0
0
0
0
3
100
3
Posted
Posted
lowlevelaware
lowlevelaware
I wanted to scrape a website for the per-issue appearance counts of each Kirara series and look into them, but my programming is barely above amateur level, so it was a struggle. Regular expressions, xpath, and the like were tough. If I understood this stuff well, it would probably improve my future work efficiency too.
I wanted to scrape a website for the per-issue appearance counts of each Kirara series and look into them, but my programming is barely above amateur level, so it was a struggle. Regular expressions, xpath, and the like were tough. If I understood this stuff well, it would probably improve my future work efficiency too.
regular expressions and xpath were tough,if I understood this well my future work efficiency would probably improve too I wanted to scrape a website for the per-issue appearance counts of each Kirara series but my programming is barely above amateur level so it was a struggle,regular expressions and xpath were tough
regular expressions and xpath were tough,if I understood this well my future work efficiency would probably improve too I wanted to scrape a website for the per-issue appearance counts of each Kirara series but my programming is barely above amateur level so it was a struggle,regular expressions and xpath were tough
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
black_magic100
black_magic100
1
3401.73583984375
6051.86376953125
2
2
0
0.014958
0
0.002351
0
1
234
Black_Magic100
10/13/2014 4:34:46 PM
0
3458
10948
0
False
False
False
False
True
False
t2_iu6w1
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Black_Magic100
5
10
2.76243093922652
1
0.276243093922652
0
0
148
40.8839779005525
362
RepliedTo Posted
Posted RepliedTo
scrapinghub
scrapinghub
octoparse scrolling python website ajax work bitly #x200b load select
scrolling website work bitly #x200b octoparse select 30 easy login
scrolling,feature uses,ajax 15,000 000,annually list,export used,octoparse way,scrape portion,website once,select never,need
scrolling,feature 15,000 uses,ajax 000,annually list,export used,octoparse way,scrape portion,website once,select never,need
100
https://styles.redditmedia.com/t5_34bc75/styles/profileIcon_biu940pqs3n51.jpeg?width=256&height=256&crop=256:256,smart&v=enabled&s=78adfa19e54a5fd6b738a60a6601834c9b6f1811
digital_lover119
digital_lover119
1
9374.9892578125
1974.517578125
0
1
0
0.002445
0
0.002269
0
0
235
Digital_Lover119
9/14/2020 12:00:29 PM
0
1
3
0
False
False
False
False
True
False
t2_836i7a7p
False
False
False
https://styles.redditmedia.com/t5_34bc75/styles/profileIcon_biu940pqs3n51.jpeg?width=256&height=256&crop=256:256,smart&v=enabled&s=78adfa19e54a5fd6b738a60a6601834c9b6f1811
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Digital_Lover119
59
4
3.53982300884956
1
0.884955752212389
0
0
58
51.3274336283186
113
Commented
Commented
scrapinghub
scrapinghub
web scraping opinions data find lets activities digital finddatalab competitors'
web scraping opinions data find lets activities digital finddatalab competitors'
web,scraping speed,process digital,marketers cases,web lets,digital use,web marketers,quickly customers,interests quickly,find lets,keep
web,scraping speed,process digital,marketers cases,web lets,digital use,web marketers,quickly customers,interests quickly,find lets,keep
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
mike_m1989
mike_m1989
1
9374.9892578125
1451.06567382813
2
1
0
0.002445
0
0.002609
0
0
236
Mike_M1989
6/14/2016 7:35:09 AM
0
8
0
0
False
False
False
False
True
False
t2_yptx1
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/Mike_M1989
59
142
3.58133669609079
37
0.933165195460277
0
0
2062
52.0050441361917
3965
Posted
Posted
scrapinghub webscraping infographic
scrapinghub webscraping infographic
data scraping web octoparse extraction financial scrape price more information
financial ecommerce finance market yahoo shipping price uses practical ##
data,extraction data,scraping financial,data web,data web,scraping ecommerce,data octoparse,blog scrape,financial scraping,tools helpcenter,octoparse
financial,data ecommerce,data scrape,financial web,data yahoo,finance practical,uses data,without free,shipping uses,ecommerce scraping,services
100
https://styles.redditmedia.com/t5_6s1ana/styles/profileIcon_snoo5111a92e-8790-41a7-81a7-98ac4f191b0b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=539207dae382606625c798707e709ddb9e531a54
limjetwee
limjetwee
1
1451.29711914063
4304.056640625
1
1
0
0
0
0.002439
0
0
237
limjetwee
7/28/2022 4:38:31 AM
0
2
0
0
False
False
False
False
True
False
t2_jelskzh2
False
False
False
https://styles.redditmedia.com/t5_6s1ana/styles/profileIcon_snoo5111a92e-8790-41a7-81a7-98ac4f191b0b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=539207dae382606625c798707e709ddb9e531a54
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/limjetwee
1
Posted
Posted
u_limjetwee
u_limjetwee
100
https://styles.redditmedia.com/t5_2cjyyo/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV81NTgyOTA_rare_5e5f60e4-cd5d-4769-8728-6844bf999bab-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6c42a4e8c72c5cfba482dbacde39a82a794bc638
saffa1986
saffa1986
1
8968.8623046875
856.395812988281
1
1
0
0.002445
0
0.002439
0
1
238
Saffa1986
1/11/2020 4:01:29 AM
0
1412
6697
0
False
False
False
False
True
False
t2_5f3nun2t
False
False
True
https://styles.redditmedia.com/t5_2cjyyo/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV81NTgyOTA_rare_5e5f60e4-cd5d-4769-8728-6844bf999bab-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6c42a4e8c72c5cfba482dbacde39a82a794bc638
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Saffa1986
58
0
0
0
0
0
0
11
50
22
RepliedTo
RepliedTo
Marketresearch
Marketresearch
product forsta play ve chat scrape light rep reviews know
product forsta play ve chat scrape light rep reviews know
light,play chat,rep rep,ve reviews,chat scrape,product play,meltwater know,forsta forsta,scrape product,reviews ve,light
light,play chat,rep rep,ve reviews,chat scrape,product play,meltwater know,forsta forsta,scrape product,reviews ve,light
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
-dikki
-dikki
1
8968.8623046875
332.943908691406
1
1
0
0.002445
0
0.002439
0
1
239
-dikki
4/19/2011 12:18:57 PM
0
2640
13034
0
False
False
False
False
True
False
t2_54kag
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/-dikki
58
1
0.900900900900901
0
0
0
0
52
46.8468468468468
111
RepliedTo
RepliedTo
Marketresearch
Marketresearch
meltwater reviews use data forsta separately look tools thanks yelp
meltwater reviews use data forsta separately look tools thanks yelp
ve,find find,local reviews,google clients,qsr maps,yelp data,elsewhere wouldn,go trial,thanks know,buy separately,particulate
ve,find find,local reviews,google clients,qsr maps,yelp data,elsewhere wouldn,go trial,thanks know,buy separately,particulate
157.142857142857
https://styles.redditmedia.com/t5_e6zqm/styles/profileIcon_snooe54e498d-8f7b-4422-a5a2-c8623217c068-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=81a52c7407b4f5b338ceae0c29054b9dc2f52333
rickrat
rickrat
101.440351653596
5653.1552734375
6638.38330078125
1
2
8
0.006792
0
0.00252
0
0.5
240
rickrat
3/10/2009 7:20:12 PM
0
15391
16684
0
False
False
False
False
True
False
t2_3eo01
False
True
False
https://styles.redditmedia.com/t5_e6zqm/styles/profileIcon_snooe54e498d-8f7b-4422-a5a2-c8623217c068-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=81a52c7407b4f5b338ceae0c29054b9dc2f52333
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/rickrat
15
0
0
0
0
0
0
10
55.5555555555556
18
Commented RepliedTo
Commented RepliedTo
csharp
csharp
site each different yes htmlagilitypack nuget tried github haven use
site each different yes htmlagilitypack nuget tried github haven use
different,each haven,tried each,site htmlagilitypack,github use,htmlagilitypack yes,different site,use github,nuget
different,each haven,tried each,site htmlagilitypack,github use,htmlagilitypack yes,different site,use github,nuget
100
https://styles.redditmedia.com/t5_2h9bc5/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18xODAxMzg_rare_cf39386d-4229-423a-b1d1-99063ec391ee-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=97ebc95843fb64f0ed2808429f731d9bdf16f176
jtarsier
jtarsier
1
5672.06298828125
6639.212890625
1
1
0
0.004702
0
0.002262
0
1
241
JTarsier
3/8/2020 9:06:31 PM
0
24
909
0
False
False
False
False
True
False
t2_5tzqxn7v
False
False
False
https://styles.redditmedia.com/t5_2h9bc5/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18xODAxMzg_rare_cf39386d-4229-423a-b1d1-99063ec391ee-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=97ebc95843fb64f0ed2808429f731d9bdf16f176
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/JTarsier
15
1
9.09090909090909
0
0
0
0
5
45.4545454545455
11
RepliedTo
RepliedTo
csharp
csharp
jquery better query selectors anglesharp css
jquery better query selectors anglesharp css
better,css anglesharp,better css,query selectors,jquery query,selectors
better,css anglesharp,better css,query selectors,jquery query,selectors
214.285714285714
https://styles.redditmedia.com/t5_dkqr8/styles/profileIcon_snoo3e362833-3a3f-4592-aa0c-12a725d25453-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=97c006be3669bd1d9be467047ec6945edd52183a
azraels_ghost
azraels_ghost
201.880703307192
5960.92041015625
6575.431640625
4
1
16
0.008732
0
0.002874
0
0
242
azraels_ghost
7/15/2014 1:34:42 PM
0
1024
7566
0
False
False
False
False
True
False
t2_heuas
False
False
False
https://styles.redditmedia.com/t5_dkqr8/styles/profileIcon_snoo3e362833-3a3f-4592-aa0c-12a725d25453-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=97c006be3669bd1d9be467047ec6945edd52183a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/azraels_ghost
15
2
2.10526315789474
0
0
0
0
43
45.2631578947368
95
Posted
Posted
csharp
csharp
url seen scraping c# things projects re anyone back something
url seen scraping c# things projects re anyone back something
believe,different maybe,keywords existing,c# profile,each wondering,anyone few,others site,looked anyone,done done,something already,examples
believe,different maybe,keywords existing,c# profile,each wondering,anyone few,others site,looked anyone,done done,something already,examples
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
philipp
philipp
1
6060.13525390625
5747.28857421875
0
1
0
0.005557
0
0.002181
0
0
243
Philipp
7/16/2007 12:25:32 AM
0
82212
92912
0
False
False
False
False
True
False
t2_26mwz
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Philipp
15
6
6.59340659340659
1
1.0989010989011
0
0
40
43.956043956044
91
Commented
Commented
csharp
csharp
html api need scenario site's cons feed article resilient translations
html api need scenario site's cons feed article resilient translations
changes,site above,article api,feed html,changes money,need pros,work knowing,html tested,approach cons,api article,pros
changes,site above,article api,feed html,changes money,need pros,work knowing,html tested,approach cons,api article,pros
157.142857142857
https://styles.redditmedia.com/t5_2hhdu7/styles/profileIcon_snoo4419fde4-38ea-4f61-9d70-992849fb6a76-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=934680c8b32d9b04306e90d8b057b1ac30ea9821
ardalanme
ardalanme
101.440351653596
6115.44921875
7217.47314453125
0
2
8
0.006792
0
0.002363
0
0
244
ardalanme
3/11/2020 6:20:43 PM
0
8
3
0
False
False
False
False
True
False
t2_5egec6s7
False
False
False
https://styles.redditmedia.com/t5_2hhdu7/styles/profileIcon_snoo4419fde4-38ea-4f61-9d70-992849fb6a76-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=934680c8b32d9b04306e90d8b057b1ac30ea9821
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ardalanme
15
3
6.97674418604651
0
0
0
0
15
34.8837209302326
43
Commented
Commented
u_outsourcebigdata csharp
csharp u_outsourcebigdata
browse ai month less pretty c# api outsourcebigdata scrapes check
month less pretty c# api outsourcebigdata scrapes check use 200
browse,ai check,browse 200,scrapes scrapes,month ai,browse api,c# use,api outsourcebigdata,check month,browse free,pretty
check,browse 200,scrapes scrapes,month ai,browse api,c# use,api outsourcebigdata,check month,browse free,pretty ai,free
100
https://styles.redditmedia.com/t5_3nm37/styles/profileIcon_snoo1539a1ed-ef48-4c66-8711-8970f6134942-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=13c591152f156de756395107ab27d6b81ef903de
outsourcebigdata
outsourcebigdata
1
6263.82666015625
7798.365234375
2
1
0
0.004702
0
0.002433
0
0
245
outsourcebigdata
8/13/2017 3:10:10 PM
0
1
0
0
False
False
False
False
True
False
t2_ade83he
False
False
True
https://styles.redditmedia.com/t5_3nm37/styles/profileIcon_snoo1539a1ed-ef48-4c66-8711-8970f6134942-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=13c591152f156de756395107ab27d6b81ef903de
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/outsourcebigdata
15
125
4.71164719185827
21
0.79155672823219
0
0
1535
57.8590275160196
2653
Posted
Posted
u_outsourcebigdata
u_outsourcebigdata
data scraping amazon web scraper software free use using tool
amazon product tools website service extractor automate fetch prices version
web,scraping amazon,scraper data,scraping scraping,software web,scraper amazon,scraping scraping,tools scraper,amazon website,data ai,powered
amazon,scraper scraping,software web,scraper amazon,scraping scraping,tools scraper,amazon website,data scraper,software scraping,service data,extractor
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
smooth-solution4108
smooth-solution4108
1
3735.17333984375
731.451416015625
2
1
0
0.01063
0
0.002315
0
0
246
Smooth-Solution4108
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/Smooth-Solution4108
9
0
0
0
0
0
0
1
100
1
Posted
Posted
PiratedGames
PiratedGames
removed
removed
100
https://styles.redditmedia.com/t5_ndiew/styles/profileIcon_snoo1ee8ceff-b14d-4a7b-a964-ca4c149ae51d-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0b05f82ca3c7780b2893914b38d8d13c5b9b157b
jrusse
jrusse
1
1767.21984863281
4304.056640625
1
1
0
0
0
0.002439
0
0
247
jrusse
8/18/2018 2:37:52 PM
0
1717
1287
0
False
False
False
False
True
False
t2_20evjamk
False
False
False
https://styles.redditmedia.com/t5_ndiew/styles/profileIcon_snoo1ee8ceff-b14d-4a7b-a964-ca4c149ae51d-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0b05f82ca3c7780b2893914b38d8d13c5b9b157b
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/jrusse
1
0
0
0
0
0
0
78
48.4472049689441
161
Posted
Posted
webscraping
webscraping
tutorial sub category items octoparse find block tried item each
tutorial sub category items octoparse find block tried item each
sub,elements video,attempt data,imgur elements,highlighted category,see highlighted,anyone bit,python same,category showing,green duplicate,sub
sub,elements video,attempt data,imgur elements,highlighted category,see highlighted,anyone bit,python same,category showing,green duplicate,sub
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
prodiver
prodiver
1
7597.076171875
2465.92138671875
0
1
0
0.00326
0
0.002217
0
0
248
prodiver
2/22/2012 11:28:08 AM
0
2549
177441
0
False
False
False
False
True
False
t2_6zlkz
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/prodiver
42
2
6.45161290322581
0
0
0
0
17
54.8387096774194
31
Commented
Commented
webscraping
webscraping
site going instead rendered free scrape something pure javascript probably
site going instead rendered free scrape something pure javascript probably
selenium,puppeteerjs something,selenium need,something rendered,dynamically online,scrapers free,online pure,html scrapers,going html,need dynamically,javascript
selenium,puppeteerjs something,selenium need,something rendered,dynamically online,scrapers free,online pure,html scrapers,going html,need dynamically,javascript
114.285714285714
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
nertynertt
nertynertt
26.1100879133989
7597.076171875
1800.03369140625
3
1
2
0.00489
0
0.002882
0
0
249
nertynertt
5/22/2017 10:20:18 PM
0
4334
20314
0
False
False
False
False
True
False
t2_1uy7m8q
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/nertynertt
42
3
4
1
1.33333333333333
0
0
28
37.3333333333333
75
Posted
Posted
webscraping
webscraping
making sort think still mission logs normal scraping trial boss
making sort think still mission logs normal scraping trial boss
done,sort parsehub,seem still,free boss,sent mission,scrape refreshes,logs before,starting logs,making work,constantly making,frustrating
done,sort parsehub,seem still,free boss,sent mission,scrape refreshes,logs before,starting logs,making work,constantly making,frustrating
100
https://styles.redditmedia.com/t5_90q7o/styles/profileIcon_gub89x8c3uf71.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=4568668a92989f0782977e44448ca0dcc0e911ef
fullmetalmahnmut
fullmetalmahnmut
1
7826.81884765625
2465.92138671875
0
1
0
0.00326
0
0.002217
0
0
250
FullMetalMahnmut
5/6/2017 6:03:16 AM
0
255
1159
0
False
False
False
False
True
False
t2_f38wkb
False
False
False
https://styles.redditmedia.com/t5_90q7o/styles/profileIcon_gub89x8c3uf71.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=4568668a92989f0782977e44448ca0dcc0e911ef
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/FullMetalMahnmut
42
2
15.3846153846154
0
0
0
0
5
38.4615384615385
13
Commented
Commented
webscraping
webscraping
callback javascript splash work well scrapy python
callback javascript splash work well scrapy python
splash,callback work,well python,work callback,python javascript,scrapy scrapy,splash
splash,callback work,well python,work callback,python javascript,scrapy scrapy,splash
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
digitalalchemist_
digitalalchemist_
1
8968.8623046875
1451.06567382813
0
1
0
0.002445
0
0.002439
0
0
251
digitalAlchemist_
1/2/2015 2:49:25 PM
0
1
19
0
False
False
False
False
True
False
t2_kgkrw
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/digitalAlchemist_
57
0
0
0
0
0
0
4
26.6666666666667
15
RepliedTo
RepliedTo
socialmedia
socialmedia
insight happen make give
insight happen make give
make,happen insight,make give,insight
make,happen insight,make give,insight
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
digital_unicorn_ca
digital_unicorn_ca
1
8968.8623046875
1974.517578125
1
0
0
0.002445
0
0.002439
0
0
252
digital_unicorn_ca
11/3/2016 7:14:07 PM
0
161
233
0
False
False
False
False
True
False
t2_12kpxz
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/digital_unicorn_ca
57
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
born-project89757
born-project89757
1
9781.1142578125
856.395812988281
0
1
0
0.002445
0
0.002269
0
0
253
Born-Project89757
3/3/2022 11:17:33 AM
0
1
0
0
False
False
False
False
True
False
t2_kadfa5s5
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/Born-Project89757
56
1
5.55555555555556
0
0
0
0
8
44.4444444444444
18
Commented
Commented
u_digitally_rajat
u_digitally_rajat
sharing think scraping try web tool good thanks scrapestorm
sharing think scraping try web tool good thanks scrapestorm
scrapestorm,good scraping,tool good,web think,scrapestorm tool,try sharing,think thanks,sharing web,scraping
scrapestorm,good scraping,tool good,web think,scrapestorm tool,try sharing,think thanks,sharing web,scraping
100
https://styles.redditmedia.com/t5_45mpgg/styles/profileIcon_snoo035b698c-c1de-499a-84c2-1a59330a468f-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2e7ff99c5f3e1c24043e63649e1e4038bbac10b5
digitally_rajat
digitally_rajat
1
9781.1142578125
332.943908691406
2
1
0
0.002445
0
0.002609
0
0
254
digitally_rajat
3/25/2021 7:33:28 AM
0
169
48
0
False
False
False
False
True
False
t2_b4n8zwb1
False
False
True
https://styles.redditmedia.com/t5_45mpgg/styles/profileIcon_snoo035b698c-c1de-499a-84c2-1a59330a468f-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2e7ff99c5f3e1c24043e63649e1e4038bbac10b5
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/digitally_rajat
56
58
4.3186895011169
10
0.744601638123604
0
0
731
54.4303797468354
1343
Posted
Posted
u_digitally_rajat
u_digitally_rajat
data web scraping tools features api scrape pricing tool best
data web scraping tools features api scrape pricing tool best
web,scraping scraping,tools newsdata,io web,scraper plans,start best,web data,scraping news,data scraping,tool features,features
web,scraping scraping,tools newsdata,io web,scraper plans,start best,web data,scraping news,data scraping,tool features,features
100
https://styles.redditmedia.com/t5_43x35p/styles/profileIcon_snooa3f062bd-9bbd-48de-ac33-7052116136af-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=aafe4954ea35482d2f49d284e2a1deb1024c7b79
web_scraping_corps
web_scraping_corps
1
7788.5283203125
512.769226074219
0
1
0
0.00326
0
0.002217
0
0
255
web_scraping_corps
3/15/2021 2:33:32 PM
0
67
52
0
False
False
False
False
True
False
t2_8k1gqynt
False
False
False
https://styles.redditmedia.com/t5_43x35p/styles/profileIcon_snooa3f062bd-9bbd-48de-ac33-7052116136af-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=aafe4954ea35482d2f49d284e2a1deb1024c7b79
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/web_scraping_corps
41
0
0
0
0
0
0
9
56.25
16
Commented
Commented
webscraping
webscraping
proxies detected already multiple using happen few unless
proxies detected already multiple using happen few unless
proxies,unless happen,multiple using,few multiple,proxies already,detected unless,using proxies,already few,proxies
proxies,unless happen,multiple using,few multiple,proxies already,detected unless,using proxies,already few,proxies
114.285714285714
https://styles.redditmedia.com/t5_3jo9s9/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYjljMDQyYzMyNzViYzQ5Nzk5Njg4ZWVhMWEyOWIxNDA1ZDAyOTQ2Yl8yMzkyOTk_rare_88f0a81e-ec76-4f1b-ab69-85f7e396c0b5-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=495db33cea9ae512ce5fc0ced06c67fcb6c2686b
independent-savings1
independent-savings1
26.1100879133989
7941.68994140625
71.2179489135742
3
2
2
0.00489
0
0.002882
0
0.5
256
Independent-Savings1
12/11/2020 7:54:03 PM
0
916
92
0
False
False
False
False
True
False
t2_9a4gxmp0
False
False
False
https://styles.redditmedia.com/t5_3jo9s9/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYjljMDQyYzMyNzViYzQ5Nzk5Njg4ZWVhMWEyOWIxNDA1ZDAyOTQ2Yl8yMzkyOTk_rare_88f0a81e-ec76-4f1b-ab69-85f7e396c0b5-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=495db33cea9ae512ce5fc0ced06c67fcb6c2686b
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Independent-Savings1
41
0
0
1
1.5625
0
0
31
48.4375
64
RepliedTo Posted
Posted RepliedTo
webscraping
webscraping
proxy agent user octoparse residential used appears find website enter
used appears find website enter please iproyal zillow provide screen
user,agent residential,proxy proxy,octoparse enter,zillow please,know used,residential proxy,user provide,iproyal agent,find proxy,provide
proxy,octoparse enter,zillow please,know used,residential proxy,user provide,iproyal agent,find proxy,provide change,user iproyal,used
100
https://styles.redditmedia.com/t5_6iw1vu/styles/profileIcon_snoo25c29703-926f-496a-b96c-75a531cde4d3-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=74fa931e31cf988ce0a7157f59ae1861cb847c7a
proxyempire_io
proxyempire_io
1
7482.20458984375
1395.87182617188
1
1
0
0.00326
0
0.002217
0
1
257
ProxyEmpire_io
6/13/2022 12:28:22 PM
0
1
10
0
False
False
False
False
True
False
t2_ok764oxg
False
False
True
https://styles.redditmedia.com/t5_6iw1vu/styles/profileIcon_snoo25c29703-926f-496a-b96c-75a531cde4d3-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=74fa931e31cf988ce0a7157f59ae1861cb847c7a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ProxyEmpire_io
41
2
2.32558139534884
4
4.65116279069767
0
0
39
45.3488372093023
86
Commented RepliedTo
Commented RepliedTo
webscraping
webscraping
agents issue proxies user give suggest changing list using solve
agents issue proxies user give suggest changing list using issues
user,agents agents,list changing,user using,datacenter proxies,suggest proxies,instead suggest,changing try,happy simply,changing issue,proxies
user,agents agents,list changing,user using,datacenter proxies,suggest proxies,instead suggest,changing try,happy simply,changing issue,proxies
100
https://styles.redditmedia.com/t5_dhukl/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8xMjQ5Nzk5_rare_a8cf0dfa-cdc2-48a6-a507-843df288edbd-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=79a15c86d36bf0a1d8286f62fbe6f77e66e4c1f8
clxyder
clxyder
1
2929.7314453125
3325.87817382813
1
1
0
0.016436
0
0.00217
0
1
258
clxyder
10/2/2015 2:12:52 PM
0
1
36
0
False
False
False
False
True
False
t2_qw2mg
False
False
False
https://styles.redditmedia.com/t5_dhukl/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8xMjQ5Nzk5_rare_a8cf0dfa-cdc2-48a6-a507-843df288edbd-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=79a15c86d36bf0a1d8286f62fbe6f77e66e4c1f8
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/clxyder
5
0
0
0
0
0
0
8
61.5384615384615
13
Commented
Commented
webscraping
webscraping
melisa thought youtube video good make article thanks
melisa thought youtube video good make article thanks
video,youtube make,video thought,make melisa,good thanks,melisa good,article article,thought
video,youtube make,video thought,make melisa,good thanks,melisa good,article article,thought
100
https://styles.redditmedia.com/t5_29xjit/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYjljMDQyYzMyNzViYzQ5Nzk5Njg4ZWVhMWEyOWIxNDA1ZDAyOTQ2Yl83NjA1MA_rare_d1b6846a-3346-4516-8fdd-901a8b7b2a9b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e1805ac5f51c9853f96825fa424a9928b52f6ec3
dedpul218
dedpul218
1
819.451293945313
4304.056640625
1
1
0
0
0
0.002439
0
0
259
dedpul218
12/7/2019 9:03:55 PM
0
320
543
0
False
False
False
False
True
False
t2_56evesxc
False
False
False
https://styles.redditmedia.com/t5_29xjit/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYjljMDQyYzMyNzViYzQ5Nzk5Njg4ZWVhMWEyOWIxNDA1ZDAyOTQ2Yl83NjA1MA_rare_d1b6846a-3346-4516-8fdd-901a8b7b2a9b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e1805ac5f51c9853f96825fa424a9928b52f6ec3
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/dedpul218
1
0
0
0
0
0
0
9
42.8571428571429
21
Posted Commented
Commented Posted
webscraping
webscraping
city bangalore swiggy scrape webpage trying
city bangalore swiggy scrape webpage trying
swiggy,city city,bangalore trying,scrape bangalore,swiggy webpage,trying bangalore,webpage
swiggy,city city,bangalore trying,scrape bangalore,swiggy webpage,trying bangalore,webpage
100
https://styles.redditmedia.com/t5_k7bm0/styles/profileIcon_snoo2d7b4c5b-34d1-4e79-9832-20312b643cc7-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c8c752b65be238aef809335afb0fba59833a582e
josemontano
josemontano
1
2802.39282226563
3442.31298828125
0
1
0
0.016436
0
0.00217
0
0
260
josemontano
6/9/2018 12:08:53 AM
0
500
337
0
False
False
False
False
True
False
t2_1it97qi0
False
False
True
https://styles.redditmedia.com/t5_k7bm0/styles/profileIcon_snoo2d7b4c5b-34d1-4e79-9832-20312b643cc7-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c8c752b65be238aef809335afb0fba59833a582e
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/josemontano
5
0
0
0
0
0
0
3
100
3
Commented
Commented
webscraping
webscraping
greetings good topic
greetings good topic
good,topic topic,greetings
good,topic topic,greetings
100
https://styles.redditmedia.com/t5_2wclkr/styles/profileIcon_snoodcf74b1c-6f41-426a-bf93-1b4401c3cff2-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=73f3147f0895ca38881998c37a97ab86727b3272
hellish_reader
hellish_reader
1
1282.30651855469
784.0546875
1
1
0
0.023566
0
0.002148
0
1
261
hellish_reader
7/23/2020 12:26:46 PM
0
248
47
0
False
False
False
False
True
False
t2_7dsfbqoc
False
False
False
https://styles.redditmedia.com/t5_2wclkr/styles/profileIcon_snoodcf74b1c-6f41-426a-bf93-1b4401c3cff2-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=73f3147f0895ca38881998c37a97ab86727b3272
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/hellish_reader
2
1
8.33333333333333
0
0
0
0
9
75
12
Commented RepliedTo
Commented RepliedTo
webscraping
webscraping
using lists asin amazon scrape welcome used data scrapy python
using lists asin amazon scrape welcome used data scrapy python
data,using python,scrapy used,python using,asin amazon,data scrape,amazon asin,lists scrapy,scrape
data,using python,scrapy used,python using,asin amazon,data scrape,amazon asin,lists scrapy,scrape
1000
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
jacksonsmomma06
jacksonsmomma06
6232.48681299016
1119.88061523438
1170.482421875
6
3
496.333333
0.030803
2E-06
0.003011
0
0.4
262
jacksonsmomma06
1/5/2021 3:07:36 AM
0
1
0
0
False
False
False
False
True
False
t2_8tyxbr20
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/jacksonsmomma06
2
5
8.92857142857143
0
0
0
0
18
32.1428571428571
56
RepliedTo Posted
Posted RepliedTo
webscraping
webscraping
product give octoparse tips extract far text better suggest thank
product give octoparse tips extract far text better suggest thank
correct,product work,search tried,using sweet,give workflow,selecting product,extract text,data octoparse,workflow please,suggest search,tips
correct,product work,search tried,using sweet,give workflow,selecting product,extract text,data octoparse,workflow please,suggest search,tips
1000
https://styles.redditmedia.com/t5_4qjeia/styles/profileIcon_2owzst0qyia71.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=b70ad32ea437d387315de212e4291c29a9647281
dana_os
dana_os
1875.88656001544
976.953918457031
656.573486328125
0
2
149.333333
0.028348
1E-06
0.00222
0
0
263
Dana_OS
7/10/2021 1:13:42 PM
0
2
5
0
False
False
False
False
True
False
t2_d7yvq4hu
False
False
False
https://styles.redditmedia.com/t5_4qjeia/styles/profileIcon_2owzst0qyia71.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=b70ad32ea437d387315de212e4291c29a9647281
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Dana_OS
2
2
8
0
0
0
0
11
44
25
Commented RepliedTo
Commented RepliedTo
webscraping learnprogramming
webscraping learnprogramming
google service although need scraper focus amazon main many services
google service although need scraper focus amazon main many services
although,main need,free many,items items,need google,services focus,google amazon,scraper recommend,service service,although main,focus
although,main need,free many,items items,need google,services focus,google amazon,scraper recommend,service service,although main,focus
1000
https://styles.redditmedia.com/t5_ju4fp/styles/profileIcon_vne9zxayfjr11.png?width=256&height=256&crop=256:256,smart&v=enabled&s=9ed6201a62f40bb1c1df0455b151642e1631c18d
scrapestorm
scrapestorm
3340.64169248206
1562.94836425781
1187.55236816406
0
2
266
0.025907
1E-06
0.002228
0
0
264
scrapestorm
5/31/2018 6:22:42 AM
0
3
0
0
False
False
False
False
True
False
t2_1ha6s4hg
False
False
False
https://styles.redditmedia.com/t5_ju4fp/styles/profileIcon_vne9zxayfjr11.png?width=256&height=256&crop=256:256,smart&v=enabled&s=9ed6201a62f40bb1c1df0455b151642e1631c18d
False
False
False
True
Open Reddit Page for This Person
https://www.reddit.com/user/scrapestorm
2
1
3.44827586206897
0
0
0
0
12
41.3793103448276
29
Commented
Commented
webscraping
webscraping
scrapestorm scraping try tool sharing great thanks hi
sharing great thanks hi scrapestorm scraping try tool
scraping,tool try,scrapestorm scrapestorm,scrapestorm hi,try thanks,sharing sharing,try try,scraping scrapestorm,great tool,scrapestorm great,scraping
try,scrapestorm scrapestorm,scrapestorm hi,try thanks,sharing sharing,try try,scraping scrapestorm,great tool,scrapestorm great,scraping scraping,tool
1000
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
victoravb
victoravb
9521.90832964542
780.388610839844
1177.578125
2
4
758.333333
0.034619
2E-06
0.002387
0
0.5
265
VictorAVB
6/28/2020 3:33:25 PM
0
71
2
0
False
False
False
False
True
False
t2_5xsa97ek
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/VictorAVB
2
5
3.47222222222222
0
0
0
0
85
59.0277777777778
144
Commented
Commented
webscraping learnprogramming
learnprogramming webscraping
booking pde webautomation io extractor data 87 business 217 90
booking extractor pde webautomation io 87 business 217 90 search
webautomation,io io,pde pde,booking booking,search google,maps ease,217 extractor,87 web,scraper extract,business scraper,extract
pde,booking webautomation,io io,pde booking,search google,maps ease,217 extractor,87 web,scraper extract,business scraper,extract
1000
https://styles.redditmedia.com/t5_7i1nd/styles/profileIcon_0nzkib1ohyg81.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=5ea00ee48de67e851d4f9acd7db595d78d80ddbc
promptcloud
promptcloud
3118.83591172869
1066.45776367188
1841.06750488281
0
3
248.333333
0.027357
1E-06
0.002344
0
0
266
promptcloud
5/30/2013 4:35:00 AM
0
29
67
0
False
False
False
False
True
False
t2_buu8q
False
False
True
https://styles.redditmedia.com/t5_7i1nd/styles/profileIcon_0nzkib1ohyg81.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=5ea00ee48de67e851d4f9acd7db595d78d80ddbc
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/promptcloud
2
9
1.91489361702128
7
1.48936170212766
0
0
241
51.2765957446809
470
Commented
Commented
webscraping
webscraping
scraping web service data more link tool scrape tools requirements
promptcloud services vs amazon asin blog know way go huge
web,scraping scraping,service link,help scraping,tools tool,link service,provider between,web hope,helps scraping,tool service,tool
scraping,tool vs,web promptcloud,blog amazon,asin tool,vs scraping,services data,amazon blog,web way,go legal,troubles
100
https://styles.redditmedia.com/t5_2hhf87/styles/profileIcon_snoo89377143-cd7d-41e4-b3ca-56c877702001-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=8814d511dc57cc65e961e6f84eafb3c86a76d81c
lilforeskin8
lilforeskin8
1
1135.37414550781
4304.056640625
1
1
0
0
0
0.002439
0
0
267
Lilforeskin8
3/11/2020 6:36:38 PM
0
1
0
0
False
False
False
False
True
False
t2_4o4ys3ep
False
False
False
https://styles.redditmedia.com/t5_2hhf87/styles/profileIcon_snoo89377143-cd7d-41e4-b3ca-56c877702001-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=8814d511dc57cc65e961e6f84eafb3c86a76d81c
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Lilforeskin8
1
5
3.59712230215827
1
0.719424460431655
0
0
50
35.9712230215827
139
Posted
Posted
webscraping
webscraping
#x200b octoparse ofiart software website download correct workflow thanks trying
#x200b octoparse ofiart software website download correct workflow thanks trying
octoparse,download correct,workflow know,software everyone,well well,fact happens,very ofiart,ofiart 250,payment program,support trying,pick
octoparse,download correct,workflow know,software everyone,well well,fact happens,very ofiart,ofiart 250,payment program,support trying,pick
100
https://styles.redditmedia.com/t5_iyckj/styles/profileIcon_snoo3af10a01-d286-49b8-8f25-bda4b3acca55-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5a38c7c57858ee3862e8cdbecf4ab76925cb462f
theelectricslide2
theelectricslide2
1
465.827453613281
3240.41674804688
0
1
0
0.015401
0
0.002264
0
0
268
TheElectricSlide2
5/8/2018 11:53:16 AM
0
2189
27408
0
False
False
False
False
True
False
t2_1bwcu6ye
False
False
False
https://styles.redditmedia.com/t5_iyckj/styles/profileIcon_snoo3af10a01-d286-49b8-8f25-bda4b3acca55-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5a38c7c57858ee3862e8cdbecf4ab76925cb462f
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/TheElectricSlide2
2
1
3.33333333333333
0
0
0
0
11
36.6666666666667
30
RepliedTo
RepliedTo
webscraping
webscraping
scraping consider addition agree wondering person 6000 questions team good
scraping consider addition agree wondering person 6000 questions team good
person,task wondering,consider addition,wondering scraping,6000 sites,person questions,addition good,questions consider,scraping agree,good 6000,sites
person,task wondering,consider addition,wondering scraping,6000 sites,person questions,addition good,questions consider,scraping agree,good 6000,sites
657.142857142857
https://styles.redditmedia.com/t5_2fx6zq/styles/profileIcon_snooa4ad1121-131c-42d9-a5b0-e4fba2c7115c-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e76b02399ebcd08a18614f9684b1056db1071e36
makedatauseful
makedatauseful
980.293428622559
727.856079101563
2948.44653320313
1
1
78
0.018195
0
0.002538
0
0
269
makedatauseful
2/21/2020 12:39:10 AM
0
3428
792
0
False
False
False
False
True
False
t2_5qn3ekda
False
False
False
https://styles.redditmedia.com/t5_2fx6zq/styles/profileIcon_snooa4ad1121-131c-42d9-a5b0-e4fba2c7115c-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e76b02399ebcd08a18614f9684b1056db1071e36
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/makedatauseful
2
0
0
0
0
0
0
23
39.6551724137931
58
Commented
Commented
webscraping
webscraping
websites 000 data ongoing project requirement budget one obtaining variation
websites 000 data ongoing project requirement budget one obtaining variation
000,websites websites,identical points,one data,ongoing presented,know requirement,000 obtaining,data know,websites variation,data data,points
000,websites websites,identical points,one data,ongoing presented,know requirement,000 obtaining,data know,websites variation,data data,points
1000
https://styles.redditmedia.com/t5_28v89n/styles/profileIcon_snoo88a7909f-45d8-409f-be65-a629b95b8937-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d6c01f551d7e36a8609962d9bbf2d602106aab10
jackphumphrey
jackphumphrey
1909.36668141832
900.996154785156
2466.837890625
3
1
152
0.021977
0
0.002506
0
0
270
jackphumphrey
11/22/2019 5:17:19 AM
0
1011
8736
0
False
False
False
False
True
False
t2_4xv8ztvw
False
False
True
https://styles.redditmedia.com/t5_28v89n/styles/profileIcon_snoo88a7909f-45d8-409f-be65-a629b95b8937-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d6c01f551d7e36a8609962d9bbf2d602106aab10
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/jackphumphrey
2
8
4.1025641025641
1
0.512820512820513
0
0
85
43.5897435897436
195
Posted
Posted
webscraping
webscraping
data willing pay file price community scraping etc jack io
data willing pay file price community scraping etc jack io
potp,io jack,potp willing,pay rules,scraping links,compiled pay,servers etc,use digital,ocean sorry,mods webapp,app
potp,io jack,potp willing,pay rules,scraping links,compiled pay,servers etc,use digital,ocean sorry,mods webapp,app
100
https://styles.redditmedia.com/t5_byskj/styles/profileIcon_tmhagc23uea01.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=1958f6c0b0a494ecfd018e928c72845249e931a9
herr_major
herr_major
1
503.528503417969
3642.44189453125
1
1
0
0
0
0.002439
0
0
271
Herr_Major
1/16/2018 9:48:20 AM
0
3170
20
0
False
False
False
False
True
False
t2_s8hyfl0
False
False
False
https://styles.redditmedia.com/t5_byskj/styles/profileIcon_tmhagc23uea01.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=1958f6c0b0a494ecfd018e928c72845249e931a9
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Herr_Major
1
15
3.28947368421053
7
1.53508771929825
0
0
194
42.5438596491228
456
Posted
Posted
u_Herr_Major
u_Herr_Major
amazon see excel formulas know business remember data today collection
amazon see excel formulas know business remember data today collection
years,ago data,collection failed,remember today,see amazon,amazon microsoft,guide times,old time,start future,already 15,18
years,ago data,collection failed,remember today,see amazon,amazon microsoft,guide times,old time,start future,already 15,18
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
shocka_locka
shocka_locka
1
819.451293945313
3642.44189453125
1
1
0
0
0
0.002439
0
0
272
shocka_locka
8/15/2019 7:10:19 PM
0
1145
6536
0
False
False
False
False
True
False
t2_4dkonwzi
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/shocka_locka
1
1
2.04081632653061
1
2.04081632653061
0
0
17
34.6938775510204
49
Posted
Posted
webscraping
webscraping
date look shown bottom identifier button scrap earlier update differences
date look shown bottom identifier button scrap earlier update differences
earlier,button between,thank unique,identifier look,code identifier,shown shown,look url,change date,date code,find date,update
earlier,button between,thank unique,identifier look,code identifier,shown shown,look url,change date,date code,find date,update
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
bruce_wayne89
bruce_wayne89
1
2083.142578125
4304.056640625
1
1
0
0
0
0.002439
0
0
273
Bruce_wayne89
11/21/2016 1:15:50 AM
0
2224
4785
0
False
False
False
False
True
False
t2_12yim1
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Bruce_wayne89
1
0
0
2
4.08163265306122
0
0
19
38.7755102040816
49
Posted
Posted
scrapinghub
scrapinghub
indie anyone problem hey hacker shows quote octoparse company random
indie anyone problem hey hacker shows quote octoparse company random
indie,hacker loading,screen list,company octoparse,shows random,quote everyone,scrap know,fix fix,thanks company,websites problem,octoparse
indie,hacker loading,screen list,company octoparse,shows random,quote everyone,scrap know,fix fix,thanks company,websites problem,octoparse
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
r3crac
r3crac
1
187.605651855469
3642.44189453125
1
1
0
0
0
0.002439
0
0
274
r3crac
12/26/2012 12:39:48 PM
0
2379
61
0
False
False
False
False
True
False
t2_a08ey
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/r3crac
1
0
0
0
0
0
0
4
80
5
Posted
Posted
couponsfromchina
couponsfromchina
link here's coupon octoparse
link here's coupon octoparse
here's,link coupon,octoparse octoparse,here's
here's,link coupon,octoparse octoparse,here's
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
sameagainthesecond
sameagainthesecond
1
1135.37414550781
4965.671875
1
1
0
0
0
0.002439
0
0
275
SameAgainTheSecond
1/4/2016 4:33:06 PM
0
40
65
0
False
False
False
False
True
False
t2_th5rr
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/SameAgainTheSecond
1
8
1.88679245283019
4
0.943396226415094
0
0
183
43.1603773584906
424
Posted
Posted
ProgrammingDiscussion
ProgrammingDiscussion
#x200b cite use links web case maby sure search check
#x200b cite use links web case maby sure search check
#x200b,use use,case search,links check,cite case,find loads,links cite,loads #x200b,maby sure,place learn,tools
#x200b,use use,case search,links check,cite case,find loads,links cite,loads #x200b,maby sure,place learn,tools
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
gramdel
gramdel
1
29.6442337036133
1400.35681152344
0
1
0
0.021146
0
0.00217
0
0
276
gramdel
6/25/2016 5:39:47 PM
0
1
12609
0
False
False
False
False
True
False
t2_z0ale
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/gramdel
2
0
0
0
0
0
0
19
51.3513513513514
37
Commented
Commented
learnprogramming
learnprogramming
00 time milliseconds unix figure although example 1970 date hand
00 time milliseconds unix figure although example 1970 date hand
00,00 hand,milliseconds 00,utc date,time milliseconds,convert unix,time convert,date january,1970 figure,hand utc,january
00,00 hand,milliseconds 00,utc date,time milliseconds,convert unix,time convert,date january,1970 figure,hand utc,january
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
insertalias
insertalias
1
77.1873092651367
1668.83850097656
0
1
0
0.021146
0
0.00217
0
0
277
insertAlias
5/20/2008 6:33:47 PM
0
1590
128724
0
False
False
False
False
True
False
t2_35f3l
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/insertAlias
2
0
0
0
0
0
0
15
55.5555555555556
27
Commented
Commented
learnprogramming
learnprogramming
2021 13 seconds agree example milliseconds epoch thu 34 utc
2021 13 seconds agree example milliseconds epoch thu 34 utc
1620945297000,equivalent 57,utc example,1620945297000 milliseconds,seconds agree,epoch timestamps,milliseconds epoch,timestamps 22,34 seconds,example 13,2021
1620945297000,equivalent 57,utc example,1620945297000 milliseconds,seconds agree,epoch timestamps,milliseconds epoch,timestamps 22,34 seconds,example 13,2021
100
https://styles.redditmedia.com/t5_3vhmyf/styles/profileIcon_snoo20f7d5bc-f965-4557-bce9-1bd91e950b80-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=056938a9823e3980976f4be919e799a78f91efec
oukaili80
oukaili80
1
2829.50048828125
6516.4423828125
0
1
0
0.01604
0
0.002204
0
0
278
oukaili80
2/1/2021 12:20:57 PM
0
1
5
0
False
False
False
False
True
False
t2_97nw7x3v
False
False
False
https://styles.redditmedia.com/t5_3vhmyf/styles/profileIcon_snoo20f7d5bc-f965-4557-bce9-1bd91e950b80-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=056938a9823e3980976f4be919e799a78f91efec
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/oukaili80
5
1
5.88235294117647
0
0
0
0
7
41.1764705882353
17
Commented
Commented
webscraping
webscraping
need dm octaparse help depending happy extract
need dm octaparse help depending happy extract
octaparse,happy extract,need help,dm depending,need need,extract happy,help need,octaparse
octaparse,happy extract,need help,dm depending,need need,extract happy,help need,octaparse
557.142857142857
https://styles.redditmedia.com/t5_908cd/styles/profileIcon_snoob4f6dbd9-964e-4aee-a22d-5d66f53cde8f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=4a30ca1f1639d814d1aed413b57bbb53afdddf52
liveehlearn
liveehlearn
804.522813228766
2868.25415039063
5667.478515625
3
1
64
0.01987
1E-06
0.002619
0
0
279
LiveEhLearn
5/13/2017 4:27:12 PM
0
102
508
0
False
False
False
False
True
False
t2_11c6sw8
False
False
False
https://styles.redditmedia.com/t5_908cd/styles/profileIcon_snoob4f6dbd9-964e-4aee-a22d-5d66f53cde8f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=4a30ca1f1639d814d1aed413b57bbb53afdddf52
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/LiveEhLearn
5
0
0
0
0
0
0
34
45.3333333333333
75
Posted Commented
Commented Posted
webscraping
webscraping
twitter api minds same help found thanks octoparse dm everyone
minds same help found thanks octoparse dm everyone handle selected
twitter,api minds,figure same,twitter api,account selected,during figure,re everyone,realized application,found future,pick help,dm
minds,figure same,twitter api,account selected,during figure,re everyone,realized application,found future,pick help,dm apply,twitter
1000
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
annh1234
annh1234
4068.83424197063
3107.01904296875
5229.05615234375
0
2
324
0.021473
1E-06
0.002259
0
0
280
Annh1234
5/28/2015 9:49:39 PM
0
107
13508
0
False
False
False
False
True
False
t2_nru14
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Annh1234
5
2
2.1505376344086
1
1.0752688172043
0
0
35
37.6344086021505
93
RepliedTo
RepliedTo
webscraping
webscraping
keyword site1 site google select url specifically forgot looking exact
keyword site1 site google select url specifically forgot looking exact
site1,keyword forgot,exact certain,url indexed,cause supports,xpath test,case xpath,forgot one,looking keyword,indexed results,site1
site1,keyword forgot,exact certain,url indexed,cause supports,xpath test,case xpath,forgot one,looking keyword,indexed results,site1
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
eugenequah
eugenequah
1
2925.51611328125
7313.77685546875
1
1
0
0.023961
0.153925
0.002133
0
1
281
EugeneQuah
2/25/2020 4:12:28 AM
0
3
1
0
False
False
False
False
True
False
t2_5rx22ksq
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/EugeneQuah
3
1
9.09090909090909
1
9.09090909090909
0
0
4
36.3636363636364
11
Commented
Commented
Octoparse_ideas
Octoparse_ideas
useful very access unfortunately webpage looks
useful very access unfortunately webpage looks
unfortunately,access access,webpage very,useful useful,unfortunately looks,very
unfortunately,access access,webpage very,useful useful,unfortunately looks,very
114.285714285714
https://styles.redditmedia.com/t5_571ac0/styles/profileIcon_snoo5f52dbc8-f253-4cb2-aa17-a1557a31109a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0a23a4a10985d689095620c8f114f164c4eb36a2
wannakeepmyanonymity
wannakeepmyanonymity
26.1100879133989
7597.076171875
3201.24682617188
3
2
2
0.00489
0
0.002882
0
0.5
282
wannakeepmyanonymity
10/17/2021 5:54:09 PM
0
554
476
0
False
False
False
False
True
False
t2_fkbmxz1t
False
False
False
https://styles.redditmedia.com/t5_571ac0/styles/profileIcon_snoo5f52dbc8-f253-4cb2-aa17-a1557a31109a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0a23a4a10985d689095620c8f114f164c4eb36a2
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/wannakeepmyanonymity
40
11
2.55813953488372
9
2.09302325581395
0
0
177
41.1627906976744
430
Posted RepliedTo
RepliedTo Posted
webscraping
webscraping
find octoparse impressum open click need issue xpath try entries
impressum click issue xpath try example websites collect expression regex
find,xpath loop,click google,maps example,impressum find,link try,run octoparse,find really,familiar back,find html,homepages
find,xpath example,impressum find,link loop,click google,maps try,run octoparse,find really,familiar back,find html,homepages
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
duhhuh
duhhuh
1
7597.076171875
3863.57373046875
1
1
0
0.00326
0
0.002217
0
1
283
duhhuh
1/16/2013 6:23:31 PM
0
381
19589
0
False
False
False
False
True
False
t2_a92sv
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/duhhuh
40
0
0
0
0
0
0
40
43.010752688172
93
Commented
Commented
webscraping
webscraping
href string var contains theimpressumlink impressum octoparse index regex boolean
href string var contains theimpressumlink impressum octoparse index regex boolean
contains,impressum language,function var,theimpressumlink returns,index var,alinks see,href href,contains getatrributevalue,href string,within within,string
contains,impressum language,function var,theimpressumlink returns,index var,alinks see,href href,contains getatrributevalue,href string,within within,string
100
https://styles.redditmedia.com/t5_m3trf/styles/profileIcon_0xo6t6aju5m61.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6332a377d5954fbb0b6afd7c0ae4a2c8dc5d6d46
sturmsignal
sturmsignal
1
7826.81884765625
3863.57373046875
0
1
0
0.00326
0
0.002217
0
0
284
sturmsignal
7/21/2018 1:28:45 PM
0
231
92
0
False
False
False
False
True
False
t2_1tntbek5
False
False
False
https://styles.redditmedia.com/t5_m3trf/styles/profileIcon_0xo6t6aju5m61.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6332a377d5954fbb0b6afd7c0ae4a2c8dc5d6d46
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/sturmsignal
40
1
2.63157894736842
0
0
0
0
15
39.4736842105263
38
Commented
Commented
webscraping
webscraping
contains node octoparse lead scraper impressum link one command extract
contains node octoparse lead scraper impressum link one command extract
way,access contains,impressum lead,scraper extract,link contains,command impressum,lead xpath,contains access,node node,extract command,contains
way,access contains,impressum lead,scraper extract,link contains,command impressum,lead xpath,contains access,node node,extract command,contains
185.714285714286
https://styles.redditmedia.com/t5_orgu5/styles/profileIcon_snoo5b88232f-ab72-48fc-9de3-38fcd440ccbd-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a78c475f5e92fda13a1c8bd6119bcf69d35414bb
elles_bells_
elles_bells_
151.660527480394
8872.51953125
9197.798828125
5
4
12
0.00978
0
0.003484
0
0.75
285
elles_bells_
9/19/2018 8:39:55 AM
0
169
143
0
False
False
False
False
True
False
t2_28k968p8
False
False
False
https://styles.redditmedia.com/t5_orgu5/styles/profileIcon_snoo5b88232f-ab72-48fc-9de3-38fcd440ccbd-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a78c475f5e92fda13a1c8bd6119bcf69d35414bb
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/elles_bells_
17
7
3.15315315315315
4
1.8018018018018
0
0
84
37.8378378378378
222
RepliedTo Posted
Posted RepliedTo
linguistics learnprogramming
learnprogramming linguistics
tweets thanks know python limited learning anyone distance corpus little
know distance barishasdemir kaggle python limited learning tweets thanks anyone
kaggle,barishasdemir barishasdemir,tweets distance,learning tweets,distance corpus,tweets little,limited forgive,essentially limited,python whole,python thanks,advice
kaggle,barishasdemir barishasdemir,tweets distance,learning tweets,distance corpus,tweets little,limited forgive,essentially limited,python whole,python thanks,advice
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
edwardsrk
edwardsrk
1
9026.0830078125
9927.7822265625
1
1
0
0.005589
0
0.002178
0
1
286
edwardsrk
11/12/2017 1:18:12 AM
0
10428
17529
0
False
False
False
False
True
False
t2_7wo3mdn
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/edwardsrk
17
3
5.88235294117647
2
3.92156862745098
0
0
17
33.3333333333333
51
Commented RepliedTo
Commented RepliedTo
linguistics
linguistics
yea twitter bother using api appeal none web corpora theyll
bother using api appeal none web corpora theyll languagetechnology help
hard,bother cross,onto api,none appeal,yoi none,datasets dozens,twitter yea,hard corpora,available bother,dozens languagetechnology,theyll
hard,bother cross,onto api,none appeal,yoi none,datasets dozens,twitter yea,hard corpora,available bother,dozens languagetechnology,theyll
100
https://styles.redditmedia.com/t5_2nmw6m/styles/profileIcon_snoo307c45b1-9ce6-4339-9a4f-ad77cb0a2acd-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=688979ee4a982be9515653d28f7a18d07134c7d7
sedulas
sedulas
1
9228.25
8882.67578125
1
1
0
0.005589
0
0.002178
0
1
287
Sedulas
5/12/2020 11:10:34 AM
0
12842
30797
0
False
False
False
False
True
False
t2_6g45m2tc
False
False
False
https://styles.redditmedia.com/t5_2nmw6m/styles/profileIcon_snoo307c45b1-9ce6-4339-9a4f-ad77cb0a2acd-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=688979ee4a982be9515653d28f7a18d07134c7d7
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Sedulas
17
1
1.36986301369863
0
0
0
0
37
50.6849315068493
73
Commented RepliedTo
Commented RepliedTo
linguistics
linguistics
similar wondering posted ones same link terms social thanks distance
ones same link terms social thanks distance comment find linkedin
similar,ones during,2020 learning,used link,wondering corpora,similar linkedin,similar data,social social,media part,pandemic seeing,link
similar,ones during,2020 learning,used link,wondering corpora,similar linkedin,similar data,social social,media part,pandemic seeing,link
1000
https://styles.redditmedia.com/t5_8a1ta7/styles/profileIcon_snooc1a50e77-a351-44fe-b150-da2ddcc7f17e-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=70936652d96a0d35ea1ca552607b83929171650b
thisisprice
thisisprice
1582.93553854413
3507.69165039063
5297.35205078125
4
4
126
0.015302
0
0.00294
0
1
288
thisisprice
4/24/2023 12:11:30 AM
0
1
0
0
False
False
False
False
True
False
t2_5vpgs75l
False
False
False
https://styles.redditmedia.com/t5_8a1ta7/styles/profileIcon_snooc1a50e77-a351-44fe-b150-da2ddcc7f17e-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=70936652d96a0d35ea1ca552607b83929171650b
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/thisisprice
5
14
3.3096926713948
8
1.89125295508274
0
0
167
39.4799054373522
423
RepliedTo Posted
Posted RepliedTo
webscraping
webscraping
need scraping queries work know scraper gui without xpath data
need work scraper gui without scraping queries know supports explain
similar,data octoparse,scrapestorm ready,go required,skill idea,work topic,scrape such,software replicate,steps nerve,learn scrapestorm,support
octoparse,scrapestorm similar,data ready,go required,skill idea,work topic,scrape such,software replicate,steps nerve,learn scrapestorm,support
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
brownbottlecap
brownbottlecap
1
3705.529296875
5711.259765625
1
1
0
0.012925
0
0.002183
0
1
289
brownbottlecap
1/3/2017 7:33:50 AM
0
256
483
0
False
False
False
False
True
False
t2_143njg
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/brownbottlecap
5
0
0
0
0
0
0
2
66.6666666666667
3
Commented
Commented
webscraping
webscraping
ask chatgpt
ask chatgpt
ask,chatgpt
ask,chatgpt
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
kong_don
kong_don
1
3699.619140625
4916.947265625
1
1
0
0.012925
0
0.002183
0
1
290
Kong_Don
10/15/2021 7:09:02 AM
0
47
515
0
False
False
False
False
True
False
t2_fhyzfm8u
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Kong_Don
5
0
0
0
0
0
0
6
60
10
Commented
Commented
webscraping
webscraping
name scrape data website trying give
name scrape data website trying give
give,website trying,scrape data,trying name,data website,name
give,website trying,scrape data,trying name,data website,name
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
markuskruber
markuskruber
1
8487.1435546875
9927.7822265625
0
1
0
0.004347
0
0.002263
0
0
291
MarkusKruber
12/13/2016 3:03:22 AM
0
829
1218
0
False
False
False
False
True
False
t2_13j0an
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/MarkusKruber
18
2
12.5
0
0
0
0
5
31.25
16
RepliedTo
RepliedTo
consulting
consulting
nlp good work someone lot done answer
nlp good work someone lot done answer
someone,done good,answer work,nlp done,lot lot,work answer,someone
someone,done good,answer work,nlp done,lot lot,work answer,someone
142.857142857143
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
econofit
econofit
76.3302637401968
8081.26123046875
8467.814453125
2
1
6
0.00652
0
0.002526
0
0.5
292
econofit
3/8/2018 10:54:43 PM
0
1045
6478
0
False
False
False
False
True
False
t2_b6p12o
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/econofit
18
1
1.44927536231884
3
4.34782608695652
0
0
34
49.2753623188406
69
Commented
Commented
consulting
consulting
both words terms extraction look key umbrella tools soya offered
both words terms extraction look key umbrella tools soya offered
under,umbrella differ,single sauce,relevant key,words language,processing umbrella,natural concepts,fall both,those matching,soya natural,language
under,umbrella differ,single sauce,relevant key,words language,processing umbrella,natural concepts,fall both,those matching,soya natural,language
171.428571428571
https://styles.redditmedia.com/t5_5aei5j/styles/profileIcon_snooa89826ff-0658-4922-ad3c-cb66cb6b9086-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=4ac6e72b828cb3c239c24b85bb469f6ee003b53a
mysterious-airline1
mysterious-airline1
126.550439566995
8152.1806640625
8535.9482421875
4
3
10
0.007824
0
0.003032
0
0.666666666666667
293
Mysterious-Airline1
11/5/2021 7:18:14 AM
0
1
1
0
False
False
False
False
True
False
t2_bmukppcl
False
False
False
https://styles.redditmedia.com/t5_5aei5j/styles/profileIcon_snooa89826ff-0658-4922-ad3c-cb66cb6b9086-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=4ac6e72b828cb3c239c24b85bb469f6ee003b53a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Mysterious-Airline1
18
6
4.10958904109589
5
3.42465753424658
0
0
66
45.2054794520548
146
RepliedTo Posted
Posted RepliedTo
consulting
consulting
soya sauce dark competitor competitors product products thank price keyword
soya sauce dark competitor products price keyword competitors different conducting
soya,sauce sauce,competitor sauce,dark competitor,soya dark,soya differently,slight look,terms excellent,response variations,dark using,octoparse
soya,sauce sauce,competitor sauce,dark competitor,soya dark,soya differently,slight look,terms excellent,response variations,dark using,octoparse
100
https://styles.redditmedia.com/t5_3f05s5/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfM2I0NzdhNmIxYmUyMzY2MjhiMDg4MzllMWU4Y2Y4YmE4ZDkzNTg5YV82NTI5MjM2_rare_89ede328-266a-4584-beca-49323edb6a98-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=81cd38074bef182931d52bb2c0225dae63d99642
wonder-barr
wonder-barr
1
8300.990234375
9228.53125
0
1
0
0.00489
0
0.002187
0
0
294
Wonder-Barr
11/15/2020 3:40:42 PM
0
1
403
0
False
False
False
False
True
False
t2_7yz7bdg6
False
False
False
https://styles.redditmedia.com/t5_3f05s5/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfM2I0NzdhNmIxYmUyMzY2MjhiMDg4MzllMWU4Y2Y4YmE4ZDkzNTg5YV82NTI5MjM2_rare_89ede328-266a-4584-beca-49323edb6a98-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=81cd38074bef182931d52bb2c0225dae63d99642
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Wonder-Barr
18
2
3.44827586206897
1
1.72413793103448
0
0
20
34.4827586206897
58
RepliedTo
RepliedTo
consulting
consulting
mturk subset figure service take yes looking similar sexy contractors
mturk subset figure service take yes looking similar sexy contractors
farm,bunch looking,subset really,wanted subset,products quick,dirty dirty,way contractors,mturk long,figure wanted,farm yes,take
farm,bunch looking,subset really,wanted subset,products quick,dirty dirty,way contractors,mturk long,figure wanted,farm yes,take
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
privateequityadvisor
privateequityadvisor
1
7775.68212890625
8963.0458984375
1
1
0
0.00489
0
0.002187
0
1
295
PrivateEquityAdvisor
12/28/2021 11:18:10 PM
0
2
3724
0
False
False
False
False
True
False
t2_i0vsxol2
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/PrivateEquityAdvisor
18
0
0
0
0
0
0
8
72.7272727272727
11
Commented
Commented
consulting
consulting
product instead name grab sku code
product instead name grab sku code
grab,product code,sku product,code product,name instead,product sku,instead name,grab
grab,product code,sku product,code product,name instead,product sku,instead name,grab
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
devcorp101
devcorp101
1
9374.9892578125
856.395812988281
0
1
0
0.002445
0
0.002269
0
0
296
devcorp101
2/25/2021 5:12:53 PM
0
1
0
0
False
False
False
False
True
False
t2_akitzbx6
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/devcorp101
55
1
5.88235294117647
0
0
0
0
5
29.4117647058824
17
Commented
Commented
scrapy
scrapy
dm spreadsheet help format please happy
dm spreadsheet help format please happy
help,please dm,format please,dm format,spreadsheet happy,help
help,please dm,format please,dm format,spreadsheet happy,help
100
https://styles.redditmedia.com/t5_efsef/styles/profileIcon_snoo4cf5047e-3e60-4ce6-a84f-33745387060b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=33113960dfa9eaacf13c196e3ecfed7e81fcdf1c
rising_gmni
rising_gmni
1
9374.9892578125
332.943908691406
2
1
0
0.002445
0
0.002609
0
0
297
rising_gmni
1/14/2017 4:17:37 PM
0
2695
899
0
False
False
False
False
True
False
t2_14fkps
False
False
False
https://styles.redditmedia.com/t5_efsef/styles/profileIcon_snoo4cf5047e-3e60-4ce6-a84f-33745387060b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=33113960dfa9eaacf13c196e3ecfed7e81fcdf1c
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/rising_gmni
55
1
0.934579439252336
0
0
0
0
59
55.1401869158879
107
Posted
Posted
scrapy
scrapy
louey single availability php data table org webpage daily rows
louey single availability php data table org webpage daily rows
availability,php louey,org org,availability services,octoparse scraped,daily exceeds,amount format,webpage webpage,consists 165,words subscription,scheduled
availability,php louey,org org,availability services,octoparse scraped,daily exceeds,amount format,webpage webpage,consists 165,words subscription,scheduled
100
https://styles.redditmedia.com/t5_4x65wn/styles/profileIcon_9ollfs3cdai71.png?width=256&height=256&crop=256:256,smart&v=enabled&s=db86164e7f30bc01e368f9e06dffb7d3e2abb081
octoparsede
octoparsede
1
1451.29711914063
4965.671875
1
1
0
0
0
0.002439
0
0
298
OctoparseDe
8/19/2021 1:52:39 AM
0
1
0
0
False
False
False
False
True
False
t2_e067m8bd
False
False
True
https://styles.redditmedia.com/t5_4x65wn/styles/profileIcon_9ollfs3cdai71.png?width=256&height=256&crop=256:256,smart&v=enabled&s=db86164e7f30bc01e368f9e06dffb7d3e2abb081
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/OctoparseDe
1
1
1.63934426229508
3
4.91803278688525
0
0
34
55.7377049180328
61
Posted
Posted
WebScrapingDe
WebScrapingDe
scraping web the most popular applications 25 blog octoparse presented business
scraping web the most popular applications 25 blog octoparse presented business
web,scraping most popular,applications 25,most popular applications,web the,25 octoparse,blog blog,the csv,excel understands,one scraping,octoparse
web,scraping most popular,applications 25,most popular applications,web the,25 octoparse,blog blog,the csv,excel understands,one scraping,octoparse
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
dan_automation_man
dan_automation_man
1
503.528503417969
4965.671875
1
1
0
0
0
0.002439
0
0
299
Dan_Automation_Man
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/Dan_Automation_Man
1
0
0
0
0
0
0
2
66.6666666666667
3
Posted
Posted
Automation_Central
Automation_Central
coding automation
coding automation
coding,automation
coding,automation
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
exvancouverite
exvancouverite
1
9613.625
6630.39111328125
0
1
0
0.00326
0
0.002217
0
0
300
exvancouverite
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/exvancouverite
39
0
0
0
0
0
0
5
45.4545454545455
11
Commented
Commented
aws
aws
documentation simple case plenty use
documentation simple case plenty use
plenty,documentation documentation,simple simple,use use,case
plenty,documentation documentation,simple simple,use use,case
114.285714285714
https://styles.redditmedia.com/t5_cj79f/styles/profileIcon_snoo721f25fe-1e7d-4172-8564-952a823de3e9-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=002e18b2533b960771dc8a07f1d14bbbafdc4555
clarkeyyyy
clarkeyyyy
26.1100879133989
9613.625
5989.4296875
3
1
2
0.00489
0
0.002882
0
0
301
Clarkeyyyy
1/27/2017 4:25:15 PM
0
1909
411
0
False
False
False
False
True
False
t2_14ttau
False
False
False
https://styles.redditmedia.com/t5_cj79f/styles/profileIcon_snoo721f25fe-1e7d-4172-8564-952a823de3e9-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=002e18b2533b960771dc8a07f1d14bbbafdc4555
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Clarkeyyyy
39
3
4.6875
0
0
0
0
23
35.9375
64
Posted
Posted
aws
aws
ideally anyone using something s3 hey scraping talk boss files
ideally anyone using something s3 hey scraping talk boss files
json,files upload,s3 working,project coding,working automate,anyone websites,json buckets,day scraping,websites everyone,fairly someone,talk
json,files upload,s3 working,project coding,working automate,anyone websites,json buckets,day scraping,websites everyone,fairly someone,talk
100
https://styles.redditmedia.com/t5_215ril/styles/profileIcon_snoo0703e319-7edb-45db-8bad-acc3c0243bc8-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=942525bb073df86f8787146a24831a05477f3c7d
cr125rider
cr125rider
1
9850.7783203125
6630.39111328125
0
1
0
0.00326
0
0.002217
0
0
302
cr125rider
2/1/2010 6:45:46 PM
0
484
177653
0
False
False
False
False
True
False
t2_3uyqr
False
False
False
https://styles.redditmedia.com/t5_215ril/styles/profileIcon_snoo0703e319-7edb-45db-8bad-acc3c0243bc8-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=942525bb073df86f8787146a24831a05477f3c7d
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/cr125rider
39
0
0
0
0
0
0
11
47.8260869565217
23
Commented
Commented
aws
aws
yelling level read seconds stuff try expert started getting took
yelling level read seconds stuff try expert started getting took
level,stuff getting,started yelling,expert seconds,read expert,level doc,try read,getting started,doc stuff,took took,seconds
level,stuff getting,started yelling,expert seconds,read expert,level doc,try read,getting started,doc stuff,took took,seconds
1000
https://styles.redditmedia.com/t5_82b104/styles/profileIcon_snoo2f29d132-34e7-445d-9af4-d54162fbaf02-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=30264c8df142984fb82b282ba2668a77aa5cb51a
dmitriysokol
dmitriysokol
2788.21975838728
2860.75927734375
1508.47045898438
5
2
222
0.019176
0
0.002946
0
0.25
303
DmitriySokol
3/15/2023 7:09:20 PM
0
10
9
0
False
False
False
False
True
False
t2_65e0upd1d
False
False
False
https://styles.redditmedia.com/t5_82b104/styles/profileIcon_snoo2f29d132-34e7-445d-9af4-d54162fbaf02-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=30264c8df142984fb82b282ba2668a77aa5cb51a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/DmitriySokol
4
6
2.57510729613734
8
3.43347639484979
0
0
100
42.9184549356223
233
Posted RepliedTo
RepliedTo Posted
webscraping
webscraping
scrape tutorials information program helpful use attempting octoparse scraping workflow
use scraping make fails attempt reason issue scrapes communities lack
chatgpt,perspective perspective,numerous rarely,detects seemingly,fails attempt,make adequate,tutorials investors,developers information,sometimes groups,time make,scrape
rarely,detects seemingly,fails attempt,make adequate,tutorials investors,developers information,sometimes groups,time make,scrape use,effectively detect,relevant
100
https://styles.redditmedia.com/t5_6e3mv8/styles/profileIcon_snoo51422af7-c833-4b1d-a06c-8c29585723ae-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=feab9899eda898edfce7729b046810e3b9a9edcb
gordongekko16
gordongekko16
1
2721.58251953125
1082.10778808594
0
1
0
0.013971
0
0.002162
0
0
304
gordongekko16
5/19/2022 12:23:03 AM
0
11
8
0
False
False
False
False
True
False
t2_gjgjmlwx
False
False
False
https://styles.redditmedia.com/t5_6e3mv8/styles/profileIcon_snoo51422af7-c833-4b1d-a06c-8c29585723ae-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=feab9899eda898edfce7729b046810e3b9a9edcb
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/gordongekko16
4
1
5.88235294117647
0
0
0
0
8
47.0588235294118
17
Commented
Commented
webscraping
webscraping
provider good build using proxy octoparse scrapers avoid find
provider good build using proxy octoparse scrapers avoid find
build,scrapers proxy,provider find,good using,octoparse scrapers,find avoid,using good,proxy octoparse,build
build,scrapers proxy,provider find,good using,octoparse scrapers,find avoid,using good,proxy octoparse,build
1000
https://styles.redditmedia.com/t5_2j6hln/styles/profileIcon_snooe174fd37-d62e-41a3-a62e-38123875f542-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0822f7c05c2c44fa146420f382b43aaacc3e9e52
gullibleengineer4
gullibleengineer4
2411.5684396863
3074.88989257813
1307.85876464844
1
2
192
0.018111
0
0.00224
0
0.5
305
GullibleEngineer4
3/30/2020 4:22:16 PM
0
634
2849
0
False
False
False
False
True
False
t2_5thp9t8q
False
False
False
https://styles.redditmedia.com/t5_2j6hln/styles/profileIcon_snooe174fd37-d62e-41a3-a62e-38123875f542-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0822f7c05c2c44fa146420f382b43aaacc3e9e52
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/GullibleEngineer4
4
12
4.65116279069767
10
3.87596899224806
0
0
88
34.1085271317829
258
RepliedTo Commented
Commented RepliedTo
webscraping
webscraping
end mistake trial support multiple use octoparse scam services lot
trial code service problem support multiple use octoparse scam services
mistake,end contacting,support someone,write lot,services digital,ocean against,cc mistakes,made provide,lot scam,course go,beyond
contacting,support mistake,end someone,write lot,services digital,ocean against,cc mistakes,made provide,lot scam,course go,beyond
100
https://styles.redditmedia.com/t5_3fd138/styles/profileIcon_snoo2ef550f7-7728-4b2b-89bf-423af955acec-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=eb7b50ead9cba89c1166a4dd1fc5d30aa7201fc9
arandomboiisme
arandomboiisme
1
7305.08056640625
3674.84619140625
0
1
0
0.004401
0
0.002192
0
0
306
ARandomBoiIsMe
11/17/2020 10:39:14 AM
0
79
3234
0
False
False
False
False
True
False
t2_8xbtxrww
False
False
True
https://styles.redditmedia.com/t5_3fd138/styles/profileIcon_snoo2ef550f7-7728-4b2b-89bf-423af955acec-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=eb7b50ead9cba89c1166a4dd1fc5d30aa7201fc9
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ARandomBoiIsMe
29
9
6.56934306569343
1
0.72992700729927
0
0
58
42.3357664233577
137
Commented
Commented
webscraping
webscraping
scrape very pretty scraping few watson depending john version recently
scrape very pretty scraping few watson depending john version recently
depending,complexity extension,called very,basic sometimes,code pretty,limited recommend,check explains,really few,days time,effort extra,time
depending,complexity extension,called very,basic sometimes,code pretty,limited recommend,check explains,really few,days time,effort extra,time
142.857142857143
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
strokesite
strokesite
76.3302637401968
7305.08056640625
2976.91015625
4
1
6
0.007335
0
0.003179
0
0
307
Strokesite
6/12/2018 9:02:40 PM
0
1002
11059
0
False
False
False
False
True
False
t2_euy3nyn
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Strokesite
29
8
5.97014925373134
1
0.746268656716418
0
0
53
39.5522388059701
134
Posted
Posted
webscraping
webscraping
platforms sales anyone scraping websites becoming look considering octoparse write
platforms sales anyone scraping websites becoming look considering octoparse write
look,easy anyone,enlighten numbers,maybe names,addresses money,budget enlighten,web leads,ve beginning,think platforms,overly wonder,anyone
look,easy anyone,enlighten numbers,maybe names,addresses money,budget enlighten,web leads,ve beginning,think platforms,overly wonder,anyone
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
ajt9000
ajt9000
1
7010.12060546875
3674.84619140625
0
1
0
0.004401
0
0.002192
0
0
308
ajt9000
3/5/2019 6:47:30 AM
0
873
163551
0
False
False
False
False
True
False
t2_3cjsjr9m
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ajt9000
29
1
1.85185185185185
1
1.85185185185185
0
0
21
38.8888888888889
54
Commented
Commented
webscraping
webscraping
lot way many scrape interact programmatically care sites public making
lot way many scrape interact programmatically care sites public making
lot,effort public,apis site,made apis,interact offer,public put,lot easy,controlled sites,really effort,making official,way
lot,effort public,apis site,made apis,interact offer,public put,lot easy,controlled sites,really effort,making official,way
100
https://styles.redditmedia.com/t5_56s3b0/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18yNTQ2NjI0_rare_49b6005b-e786-4ac6-b417-07740e50ad89-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3337c60913db13b22f97be46f24b7652c83c8287
old_flounder_8640
old_flounder_8640
1
7010.12060546875
2976.91015625
0
1
0
0.004401
0
0.002192
0
0
309
Old_Flounder_8640
10/16/2021 8:40:58 AM
0
1
-46
0
False
False
False
False
True
False
t2_fjbbff0e
False
False
True
https://styles.redditmedia.com/t5_56s3b0/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18yNTQ2NjI0_rare_49b6005b-e786-4ac6-b417-07740e50ad89-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3337c60913db13b22f97be46f24b7652c83c8287
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Old_Flounder_8640
29
0
0
0
0
0
0
7
46.6666666666667
15
Commented
Commented
webscraping
webscraping
python learned term try long mkt originally
python learned term try long mkt originally
try,long mkt,try originally,mkt long,term learned,python python,originally
try,long mkt,try originally,mkt long,term learned,python python,originally
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
no-crew-4297
no-crew-4297
1
2890.318359375
6833.30712890625
0
1
0
0.023961
0.153925
0.002133
0
0
310
No-Crew-4297
11/26/2020 5:51:13 AM
0
1
0
0
False
False
False
False
True
False
t2_91nlgpbk
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/No-Crew-4297
3
2
3.50877192982456
0
0
0
0
23
40.3508771929825
57
Commented
Commented
Octoparse_ideas
Octoparse_ideas
web scrape data website look scraping against long anything phone
web scrape data website look scraping against long anything phone
scrape,data web,scrape look,robots convenience,stores data,long phone,number file,know address,phone anything,against against,web
scrape,data web,scrape look,robots convenience,stores data,long phone,number file,know address,phone anything,against against,web
100
https://styles.redditmedia.com/t5_hf3rh/styles/profileIcon_snoo94c89707-a525-4928-b107-16470ef5d707-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=216dfa6cf3a7791e9c6a75675ab949d1bf540074
warrior_321
warrior_321
1
7059.19287109375
2096.50366210938
0
1
0
0.004401
0
0.002192
0
0
311
warrior_321
3/29/2018 3:47:35 PM
0
99
998
0
False
False
False
False
True
False
t2_1407l36r
False
False
False
https://styles.redditmedia.com/t5_hf3rh/styles/profileIcon_snoo94c89707-a525-4928-b107-16470ef5d707-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=216dfa6cf3a7791e9c6a75675ab949d1bf540074
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/warrior_321
28
0
0
0
0
0
0
5
62.5
8
Commented
Commented
webscraping
webscraping
webscraper try plugin io
webscraper try plugin io
webscraper,io webscraper,plugin try,webscraper plugin,webscraper
webscraper,io webscraper,plugin try,webscraper plugin,webscraper
100
https://styles.redditmedia.com/t5_c81jc/styles/profileIcon_snoo4bc05c65-861c-4eff-ae79-8655a01f2aee-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=9b1a2eaa66265376bc24916858880975f3e37df4
thegrif
thegrif
1
6862.6396484375
2556.72436523438
1
1
0
0.004401
0
0.002192
0
1
312
thegrif
7/4/2009 6:13:56 AM
0
326
1175
0
False
False
False
False
True
False
t2_3ja6m
False
False
False
https://styles.redditmedia.com/t5_c81jc/styles/profileIcon_snoo4bc05c65-861c-4eff-ae79-8655a01f2aee-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=9b1a2eaa66265376bc24916858880975f3e37df4
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/thegrif
28
0
0
0
0
0
0
13
68.4210526315789
19
Commented
Commented
webscraping
webscraping
gist github tournament thegrif 77b5b01baa56f0ddb2d4eece0b80d1e2 964c3069e350ce427954a90d19a0f2fb listing details data
gist github tournament thegrif 77b5b01baa56f0ddb2d4eece0b80d1e2 964c3069e350ce427954a90d19a0f2fb listing details data
github,thegrif gist,github listing,gist thegrif,964c3069e350ce427954a90d19a0f2fb details,gist thegrif,77b5b01baa56f0ddb2d4eece0b80d1e2 tournament,listing 964c3069e350ce427954a90d19a0f2fb,tournament tournament,details data,tournament
github,thegrif gist,github listing,gist thegrif,964c3069e350ce427954a90d19a0f2fb details,gist thegrif,77b5b01baa56f0ddb2d4eece0b80d1e2 tournament,listing 964c3069e350ce427954a90d19a0f2fb,tournament tournament,details data,tournament
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
joyisbrightcolors
joyisbrightcolors
1
6168.68212890625
3696.396484375
0
1
0
0.006792
0
0.002168
0
0
313
joyisbrightcolors
12/5/2018 12:10:27 AM
0
1
1
0
False
False
False
False
True
False
t2_2q2exygx
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/joyisbrightcolors
14
0
0
0
0
0
0
3
23.0769230769231
13
Commented
Commented
scrapinghub
scrapinghub
starting far use
starting far use
starting,use use,far
starting,use use,far
242.857142857143
https://styles.redditmedia.com/t5_carzf/styles/profileIcon_mcbxqxp0wd411.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=a9f2333591bbeaa6670f29162fe212c477ff0e86
ankerstein17
ankerstein17
252.100879133989
5964.4287109375
4560.02978515625
6
1
20
0.012225
0
0.003794
0
0
314
Ankerstein17
4/28/2016 4:09:24 PM
0
60
24
0
False
False
False
False
True
False
t2_xj3q9
False
False
False
https://styles.redditmedia.com/t5_carzf/styles/profileIcon_mcbxqxp0wd411.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=a9f2333591bbeaa6670f29162fe212c477ff0e86
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Ankerstein17
14
1
2
1
2
0
0
24
48
50
Posted
Posted
scrapinghub
scrapinghub
using #x200b people scraping reason web ask scrape data learn
using #x200b people scraping reason web ask scrape data learn
#x200b,currently learn,others people,using data,#x200b #x200b,reason octoparse,#x200b reason,ask using,scraping hey,fam more,people
#x200b,currently learn,others people,using data,#x200b #x200b,reason octoparse,#x200b reason,ask using,scraping hey,fam more,people
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
pablohoffman
pablohoffman
1
5653.1552734375
4841.3779296875
0
1
0
0.006792
0
0.002168
0
0
315
pablohoffman
5/13/2008 3:55:47 PM
0
130
33
0
False
False
False
False
True
False
t2_3592a
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/pablohoffman
14
0
0
0
0
0
0
10
47.6190476190476
21
Commented
Commented
scrapinghub
scrapinghub
scraping became used octopart understand choice curious services tool compared
scraping became used octopart understand choice curious services tool compared
understand,became tool,choice became,tool curious,understand octopart,services compared,octopart choice,scraping used,compared services,curious
understand,became tool,choice became,tool curious,understand octopart,services compared,octopart choice,scraping used,compared services,curious
100
https://styles.redditmedia.com/t5_1unsyk/styles/profileIcon_snoo19e30ebc-4fc0-42c6-b4d4-dd50b287ec4c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=533a37fbd4cb8bf755277984bfb28bbde412e560
triggaztilt
triggaztilt
1
5945.2138671875
5676.0703125
0
1
0
0.006792
0
0.002168
0
0
316
TriggazTilt
6/1/2013 11:15:52 PM
0
842
3673
0
False
False
False
False
True
False
t2_bw4sq
False
False
True
https://styles.redditmedia.com/t5_1unsyk/styles/profileIcon_snoo19e30ebc-4fc0-42c6-b4d4-dd50b287ec4c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=533a37fbd4cb8bf755277984bfb28bbde412e560
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/TriggazTilt
14
0
0
0
0
0
0
7
63.6363636363636
11
Commented
Commented
scrapinghub
scrapinghub
projects small scrapy larger webdriver selenium
projects small scrapy larger webdriver selenium
scrapy,larger selenium,webdriver small,projects webdriver,scrapy projects,selenium larger,projects
scrapy,larger selenium,webdriver small,projects webdriver,scrapy projects,selenium larger,projects
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
rugantio
rugantio
1
5791.2666015625
3617.87182617188
0
1
0
0.006792
0
0.002168
0
0
317
rugantio
12/30/2018 3:02:45 PM
0
9
1
0
False
False
False
False
True
False
t2_2k5cg8fd
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/rugantio
14
1
4.16666666666667
0
0
0
0
14
58.3333333333333
24
Commented
Commented
scrapinghub
scrapinghub
selenium bs4 parsing big websites dynamic everything lxml requests crawling
selenium bs4 parsing big websites dynamic everything lxml requests crawling
requests,single single,scrapy lxml,bs4 everything,python bs4,parsing python,requests selenium,dynamic big,projects websites,lxml scrapy,recursive
requests,single single,scrapy lxml,bs4 everything,python bs4,parsing python,requests selenium,dynamic big,projects websites,lxml scrapy,recursive
100
https://styles.redditmedia.com/t5_1s3qs0/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYmZkNjcwNjY3MDUzZTUxN2E5N2FmZTU2YzkxZTRmODNmMTE2MGJkM18xOTMxNDM_rare_4a2e22d5-d1da-464c-9dfa-5f1b1945ccda-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6ae5b3efb51943bbe855ce607de3765a40628ef0
rollindeepwithdata
rollindeepwithdata
1
6263.82666015625
4968.43310546875
0
1
0
0.006792
0
0.002168
0
0
318
RollinDeepWithData
1/24/2014 10:57:26 PM
0
3582
115786
0
False
False
False
False
True
False
t2_eyq19
False
False
True
https://styles.redditmedia.com/t5_1s3qs0/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYmZkNjcwNjY3MDUzZTUxN2E5N2FmZTU2YzkxZTRmODNmMTE2MGJkM18xOTMxNDM_rare_4a2e22d5-d1da-464c-9dfa-5f1b1945ccda-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6ae5b3efb51943bbe855ce607de3765a40628ef0
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/RollinDeepWithData
14
1
2.7027027027027
3
8.10810810810811
0
0
13
35.1351351351351
37
Commented
Commented
scrapinghub
scrapinghub
using considering time rvest soup scrape learn rselenium moving invested
using considering time rvest soup scrape learn rselenium moving invested
invested,time learn,scrape worry,sunk rselenium,considering time,learn moving,beautiful sunk,cost soup,idk considering,moving using,rvest
invested,time learn,scrape worry,sunk rselenium,considering time,learn moving,beautiful sunk,cost soup,idk considering,moving using,rvest
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
zumtest99
zumtest99
1
5603.63134765625
7939.27587890625
0
1
0
0.012868
0
0.002144
0
0
319
zumtest99
10/16/2016 6:07:04 PM
0
20
53
0
False
False
False
False
True
False
t2_1261o8
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/zumtest99
8
0
0
0
0
0
0
2
66.6666666666667
3
Commented
Commented
selfhosted
selfhosted
offers yacy
offers yacy
yacy,offers
yacy,offers
742.857142857143
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
sinclairxer
sinclairxer
1130.95395610295
5215.87109375
8489.1220703125
11
6
90
0.02445
2E-06
0.005213
0
0.5
320
Sinclairxer
10/4/2021 5:30:48 AM
0
37
68
0
False
False
False
False
True
False
t2_f3rxbg7d
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Sinclairxer
8
10
7.63358778625954
0
0
0
0
59
45.0381679389313
131
RepliedTo Posted
Posted RepliedTo
selfhosted
selfhosted
something very thank much similar anyone know web octoparse try
something very much similar anyone know web octoparse try thank
thank,very very,much something,similar similar,octoparse anyone,know interesting,thank much,help similar,try scraper,something free,selfhosted
thank,very very,much something,similar similar,octoparse anyone,know interesting,thank much,help similar,try scraper,something free,selfhosted
100
https://styles.redditmedia.com/t5_3n1d1c/styles/profileIcon_snoof462d6f4-e3a1-4384-ac77-5317da810c5a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=539be55f4e5e8cc0c3d7e7a0194df2b277e6a62a
ok-until-you-arrived
ok-until-you-arrived
1
5623.51123046875
8828.4013671875
0
1
0
0.012868
0
0.002144
0
0
321
ok-until-you-arrived
12/29/2020 8:28:38 PM
0
136
763
0
False
False
False
False
True
False
t2_9k4q8sob
False
False
False
https://styles.redditmedia.com/t5_3n1d1c/styles/profileIcon_snoof462d6f4-e3a1-4384-ac77-5317da810c5a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=539be55f4e5e8cc0c3d7e7a0194df2b277e6a62a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ok-until-you-arrived
8
0
0
0
0
0
0
2
25
8
Commented
Commented
selfhosted
selfhosted
think huginn
think huginn
think,huginn
think,huginn
100
https://styles.redditmedia.com/t5_1dyzj6/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8xMTQ4MjYx_rare_c4baa0b6-a2a0-4d0b-9fd2-f1373adc6429-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6a2d386df2d0921bcd7ca26cefe128d608844c17
alfagun74
alfagun74
1
4944.0263671875
7390.39892578125
0
1
0
0.012868
0
0.002144
0
0
322
Alfagun74
5/6/2016 1:45:28 PM
0
14436
3108
0
False
False
False
False
True
False
t2_xqgrl
False
False
True
https://styles.redditmedia.com/t5_1dyzj6/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8xMTQ4MjYx_rare_c4baa0b6-a2a0-4d0b-9fd2-f1373adc6429-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6a2d386df2d0921bcd7ca26cefe128d608844c17
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Alfagun74
8
0
0
0
0
0
0
4
66.6666666666667
6
Commented
Commented
selfhosted
selfhosted
simply wget cronjobs use
simply wget cronjobs use
wget,cronjobs simply,use use,wget
wget,cronjobs simply,use use,wget
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
alx_xl
alx_xl
1
4828.09130859375
9039
0
1
0
0.012868
0
0.002144
0
0
323
Alx_xl
5/13/2014 11:11:51 PM
0
13
4296
0
False
False
False
False
True
False
t2_gjzeb
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/Alx_xl
8
1
14.2857142857143
0
0
0
0
3
42.8571428571429
7
Commented
Commented
selfhosted
selfhosted
node something work red
node something work red
red,work something,node node,red
red,work something,node node,red
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
allchoppedup
allchoppedup
1
3402.064453125
71.2179489135742
2
2
0
0.010295
0
0.00243
0
1
324
allchoppedup
10/17/2022 12:31:29 AM
0
258
704
0
False
False
False
False
True
False
t2_tfbaprw2
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/allchoppedup
4
4
4.16666666666667
1
1.04166666666667
0
0
47
48.9583333333333
96
RepliedTo Posted
Posted RepliedTo
webscraping
webscraping
sales websites text issues zapier octoparse way think awesome one
sales text issues zapier octoparse way think awesome one selected
hire,professional professional,scraper json,python coder,ve 30,today one,small ve,trying coming,way obviously,issues think,30
hire,professional professional,scraper json,python coder,ve 30,today one,small ve,trying coming,way obviously,issues think,30
371.428571428571
https://styles.redditmedia.com/t5_2bw1n8/styles/profileIcon_snoo40bcea19-05aa-49fb-b9ab-209fb7ee71b6-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3acc83f2ff9e45a4fb3c43c4fe370697e7bdad3b
trafalgardxlaw
trafalgardxlaw
478.09167035458
3331.54248046875
528.910583496094
2
2
38
0.012868
0
0.002334
0
1
325
trafalgarDxlaw
1/3/2020 9:16:44 AM
0
10104
2214
0
False
False
False
False
True
False
t2_5cz8cw46
False
False
True
https://styles.redditmedia.com/t5_2bw1n8/styles/profileIcon_snoo40bcea19-05aa-49fb-b9ab-209fb7ee71b6-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3acc83f2ff9e45a4fb3c43c4fe370697e7bdad3b
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/trafalgarDxlaw
4
2
2.27272727272727
1
1.13636363636364
0
0
39
44.3181818181818
88
Commented RepliedTo
RepliedTo Commented
webscraping
webscraping
website each structure avoid need scam bank never enough custom
website each structure avoid need scam bank never enough custom
each,website structure,need code,each use,paypal information,judge btw,user add,custom structure,well data,specific using,bank
each,website structure,need code,each use,paypal information,judge btw,user add,custom structure,well data,specific using,bank
100
https://styles.redditmedia.com/t5_2mbfdo/styles/profileIcon_40kr1xbtb6w41.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=65db943dd6114434995073af17730cc78f67c579
remcoe33
remcoe33
1
4778.650390625
449.410308837891
0
1
0
0.00815
0
0.002203
0
0
326
RemcoE33
4/29/2020 8:04:04 AM
0
37
2573
0
False
False
False
False
True
False
t2_6c345ft6
False
False
True
https://styles.redditmedia.com/t5_2mbfdo/styles/profileIcon_40kr1xbtb6w41.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=65db943dd6114434995073af17730cc78f67c579
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/RemcoE33
9
1
4.54545454545455
0
0
0
0
10
45.4545454545455
22
Commented
Commented
googlesheets
googlesheets
authorization octopase api# advanced tutorial endpoints api methods depends octoparse
authorization octopase api# advanced tutorial endpoints api methods depends octoparse
depends,api authorization,endpoints octoparse,tutorial endpoints,methods octopase,find find,authorization api#,octopase api,octoparse advanced,api# tutorial,advanced
depends,api authorization,endpoints octoparse,tutorial endpoints,methods octopase,find find,authorization api#,octopase api,octoparse advanced,api# tutorial,advanced
228.571428571429
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
real_strategy3314
real_strategy3314
226.990791220591
4445.80517578125
560.91259765625
3
1
18
0.011643
0
0.002602
0
0
327
Real_Strategy3314
1/18/2021 12:22:35 AM
0
9
65
0
False
False
False
False
True
False
t2_76q1qro2
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Real_Strategy3314
9
2
3.44827586206897
2
3.44827586206897
0
0
24
41.3793103448276
58
Posted
Posted
googlesheets webscraping
webscraping googlesheets
super pull getting data connector api hi confused extension tutorials
super pull getting data connector api hi confused extension tutorials
octoparse,tutorials getting,super data,automatically pull,data connector,pull super,confused automatically,octoparse tutorials,coz hi,use extension,api
octoparse,tutorials getting,super data,automatically pull,data connector,pull super,confused automatically,octoparse tutorials,coz hi,use extension,api
100
https://styles.redditmedia.com/t5_6815m7/styles/profileIcon_ugpk99rvbnt81.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c92a8dbf8cd7b7dcac0adbae1c2fd08ebd070118
syntaxtechnologies17
syntaxtechnologies17
1
819.451293945313
4965.671875
1
1
0
0
0
0.002439
0
0
328
syntaxtechnologies17
4/15/2022 7:32:42 AM
0
1
0
0
False
False
False
False
True
False
t2_lyo6uzct
False
False
True
https://styles.redditmedia.com/t5_6815m7/styles/profileIcon_ugpk99rvbnt81.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c92a8dbf8cd7b7dcac0adbae1c2fd08ebd070118
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/syntaxtechnologies17
1
24
3.49344978165939
7
1.01892285298399
0
0
346
50.3639010189229
687
Posted
Posted
u_syntaxtechnologies17
u_syntaxtechnologies17
data analyst analysis portfolio analytics projects skills questions position sets
data analyst analysis portfolio analytics projects skills questions position sets
data,analyst data,analytics data,analysis data,sets analyst,skills projects,include ideas,data include,portfolio public,data exploratory,data
data,analyst data,analytics data,analysis data,sets analyst,skills projects,include ideas,data include,portfolio public,data exploratory,data
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
annahrytsyk
annahrytsyk
1
187.605651855469
4304.056640625
1
1
0
0
0
0.002439
0
0
329
AnnaHrytsyk
10/20/2021 5:40:59 PM
0
1
0
0
False
False
False
False
True
False
t2_foxhriwp
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/AnnaHrytsyk
1
Posted
Posted
poland
poland
100
https://styles.redditmedia.com/t5_15muv0/styles/profileIcon_snoo59b492db-a198-4957-b0cc-2a83d4fa4cb6-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f99c735dc251ed3b69458f54e619106d5c085d38
asimrazajalbani
asimrazajalbani
1
503.528503417969
4304.056640625
1
1
0
0
0
0.002439
0
0
330
AsimRazaJalbani
6/5/2017 3:00:07 PM
0
7266
146
0
False
False
False
False
True
False
t2_2sn1c8v
False
False
True
https://styles.redditmedia.com/t5_15muv0/styles/profileIcon_snoo59b492db-a198-4957-b0cc-2a83d4fa4cb6-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f99c735dc251ed3b69458f54e619106d5c085d38
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/AsimRazaJalbani
1
Posted
Posted
programmingtools
programmingtools
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
chuck_you
chuck_you
1
1767.21984863281
4965.671875
1
1
0
0
0
0.002439
0
0
331
Chuck_You
4/21/2019 8:04:47 PM
0
222
396
0
False
False
False
False
True
False
t2_3n17eou1
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Chuck_You
1
3
3.15789473684211
0
0
0
0
40
42.1052631578947
95
Posted
Posted
webscraping
webscraping
gallery images size full option clicked better results wanted save
gallery images size full option clicked better results wanted save
better,option beginner,friendly thumbnail,size clicked,full opens,goes resolution,wanted basically,results friendly,thanks option,more number,search
better,option beginner,friendly thumbnail,size clicked,full opens,goes resolution,wanted basically,results friendly,thanks option,more number,search
100
https://styles.redditmedia.com/t5_drwnp/styles/profileIcon_snoo94654f08-c8a3-4b4e-b759-32e13d06305f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ff5eb9b0412eeaccf1d1b999046a32b0e7a5a40b
bartoncls
bartoncls
1
2284.99365234375
7980.54052734375
1
1
0
0.023961
0.153925
0.002133
0
1
332
bartoncls
4/1/2016 6:50:10 AM
0
894
5506
0
False
False
False
False
True
False
t2_wsrjq
False
False
True
https://styles.redditmedia.com/t5_drwnp/styles/profileIcon_snoo94654f08-c8a3-4b4e-b759-32e13d06305f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ff5eb9b0412eeaccf1d1b999046a32b0e7a5a40b
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/bartoncls
3
0
0
3
6.12244897959184
0
0
24
48.9795918367347
49
RepliedTo Commented
Commented RepliedTo
u_Octoparseideas
u_Octoparseideas
software desktop completely video confusing web demonstrating requires automated based
completely video confusing web demonstrating requires automated based thus things
desktop,software cloud,still completely,unusable confusing,video install,desktop requires,manual manual,fiddling octoparse,web video,demonstrating call,automated
cloud,still completely,unusable confusing,video install,desktop requires,manual manual,fiddling octoparse,web video,demonstrating call,automated unusable,use
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
githubpermalinkbot
githubpermalinkbot
1
9340.8974609375
8396.595703125
0
1
0
0.004401
0
0.002219
0
0
333
GitHubPermalinkBot
4/9/2017 4:38:54 PM
0
1
1127
0
False
False
False
False
True
False
t2_16xed7
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
False
True
Open Reddit Page for This Person
https://www.reddit.com/user/GitHubPermalinkBot
27
1
1.08695652173913
1
1.08695652173913
0
0
56
60.8695652173913
92
RepliedTo
RepliedTo
Python
Python
github links permanent samples files help articles python delete api
github links permanent samples files help articles python delete api
permanent,links youtube,api api,samples articles,getting getting,permanent links,files help,github github,articles specific,commit python,master
permanent,links youtube,api api,samples articles,getting getting,permanent links,files help,github github,articles specific,commit python,master
142.857142857143
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
alanjcastonguay
alanjcastonguay
76.3302637401968
9130.875
7071.005859375
3
1
6
0.007335
0
0.002919
0
0.333333333333333
334
alanjcastonguay
3/5/2011 5:04:35 AM
0
8022
83664
0
False
False
False
False
True
False
t2_4x7bc
False
True
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/alanjcastonguay
27
0
0
0
0
0
0
12
52.1739130434783
23
Commented RepliedTo
Commented RepliedTo
Python
Python
way gets documentation daemon word see time cron scheduler platform's
way gets documentation daemon word see time cron scheduler platform's
time,see daemon,chronos gets,way word,time see,platform's greek,word cron,gets platform's,documentation chronos,greek scheduler,daemon
time,see daemon,chronos gets,way word,time see,platform's greek,word cron,gets platform's,documentation chronos,greek scheduler,daemon
100
https://styles.redditmedia.com/t5_23e7gx/styles/profileIcon_9yofle8adn761.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=32d75e1df5258825912b0e0885d048bbd03c80d5
jozehgs
jozehgs
1
2416.3759765625
1220.16369628906
2
2
0
0.012538
0
0.002375
0
1
335
JoZeHgS
8/19/2019 4:55:33 PM
0
254
1167
0
False
False
False
False
True
False
t2_48vn02uu
False
False
False
https://styles.redditmedia.com/t5_23e7gx/styles/profileIcon_9yofle8adn761.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=32d75e1df5258825912b0e0885d048bbd03c80d5
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/JoZeHgS
4
12
3.68098159509202
2
0.613496932515337
0
0
144
44.1717791411043
326
Posted RepliedTo
RepliedTo Posted
webscraping webdev
webdev webscraping
python thanks lot octoparse extract everyone hi way photos variation
extract photos variation python x aliseeks use search ones 3d
hi,everyone thanks,lot scraping,python python,vs much,faster fastest,way python,fastest way,thanks faster,scraping octoparse,python
scraping,python python,vs much,faster fastest,way python,fastest way,thanks faster,scraping octoparse,python vs,octoparse functionality,seem
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
iamearthgirl
iamearthgirl
1
2083.142578125
4965.671875
1
1
0
0
0
0.002439
0
0
336
IAmEarthGirl
12/31/2020 9:44:43 PM
0
1
0
0
False
False
False
False
True
False
t2_9lcbecw6
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/IAmEarthGirl
1
3
1.40845070422535
3
1.40845070422535
0
0
105
49.2957746478873
213
Posted Commented
Commented Posted
webscraping
webscraping
data gt lt span before placeholders loads element modal website
gt lt span placeholders loads element modal website class need
lt,span before,data loads,lt span,class class,dynamic_field_item_id data,loads span,gt dynamic_field_item_id,gt gather,website human,eye
lt,span loads,lt span,class class,dynamic_field_item_id data,loads span,gt dynamic_field_item_id,gt gather,website human,eye technically,visible
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
fooyili
fooyili
1
3750.384765625
323.936462402344
2
1
0
0.01063
0
0.002315
0
0
337
fOOyili
1/7/2019 10:16:05 AM
0
2
0
0
False
False
False
False
True
False
t2_2xy1wc41
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/fOOyili
9
0
0
0
0
0
0
1
100
1
Posted
Posted
content_marketing
content_marketing
removed
removed
1000
https://styles.redditmedia.com/t5_2cc4ru/styles/profileIcon_wgodt94uzl941.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=39c4d67311e18084b08e2695475e15dcad23e284
larfleeeze
larfleeeze
6998.34449434882
953.8017578125
1633.09826660156
7
4
557.333333
0.031805
2E-06
0.0033
0
0.5
338
larfleeeze
1/8/2020 2:40:16 PM
0
318
111
0
False
False
False
False
True
False
t2_4d014jwg
False
False
False
https://styles.redditmedia.com/t5_2cc4ru/styles/profileIcon_wgodt94uzl941.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=39c4d67311e18084b08e2695475e15dcad23e284
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/larfleeeze
2
7
3.51758793969849
4
2.01005025125628
0
0
85
42.713567839196
199
RepliedTo Posted Commented
Commented Posted RepliedTo
webscraping
webscraping
octoparse thanks such lot softwares people free suppose scrap consider
softwares such people suppose scrap octoparse free consider try websites
thanks,lot octoparse,thanks team,people using,octoparse free,plan discovered,softwares pay,days student,such start,scratch until,10
thanks,lot octoparse,thanks team,people using,octoparse free,plan discovered,softwares pay,days student,such start,scratch until,10
100
https://styles.redditmedia.com/t5_ew903/styles/profileIcon_56kvprx3ybz61.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=c6fdc03c8b00271b0f013320eb4f28b8e8d8add4
jcrowe
jcrowe
1
822.55322265625
2027.85986328125
1
1
0
0.024148
0
0.002144
0
1
339
jcrowe
12/7/2010 2:52:29 PM
0
705
4569
0
False
False
False
False
True
False
t2_4lfh2
False
False
False
https://styles.redditmedia.com/t5_ew903/styles/profileIcon_56kvprx3ybz61.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=c6fdc03c8b00271b0f013320eb4f28b8e8d8add4
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/jcrowe
2
0
0
0
0
0
0
6
33.3333333333333
18
Commented
Commented
webscraping
webscraping
beginning one build need use tool
beginning one build need use tool
tool,beginning use,tool one,need need,use beginning,build
tool,beginning use,tool one,need need,use beginning,build
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
mclukej
mclukej
1
684.702270507813
1900.22741699219
1
1
0
0.024148
0
0.002144
0
1
340
McLukeJ
9/25/2020 8:49:45 PM
0
3
18
0
False
False
False
False
True
False
t2_88e7aby0
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/McLukeJ
2
1
1.49253731343284
1
1.49253731343284
0
0
32
47.7611940298507
67
Commented
Commented
webscraping
webscraping
those tool need found way came more similar selenium adequate
those tool need found way came more similar selenium adequate
attention,those use,existing adequate,part found,adequate before,use need,scraping came,similar tool,until things,bs part,tool
attention,those use,existing adequate,part found,adequate before,use need,scraping came,similar tool,until things,bs part,tool
100
https://styles.redditmedia.com/t5_sc8sy/styles/profileIcon_fdrbilvunsq61.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=e8ccb7771acb6c30b5419f966005631126f6bd7d
prof_happy
prof_happy
1
1449.64184570313
490.911773681641
0
1
0
0.019658
0
0.002259
0
0
341
prof_happy
12/5/2018 12:49:13 PM
0
10313
1549
0
False
False
False
False
True
False
t2_2q6f93pa
False
False
True
https://styles.redditmedia.com/t5_sc8sy/styles/profileIcon_fdrbilvunsq61.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=e8ccb7771acb6c30b5419f966005631126f6bd7d
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/prof_happy
2
2
7.69230769230769
0
0
0
0
10
38.4615384615385
26
RepliedTo
RepliedTo
webscraping
webscraping
couple scraping yes wondering web ve people toolset paid well
couple scraping yes wondering web ve people toolset paid well
yes,people capability,toolset scraping,couple people,paid ve,web couple,years well,wondering wondering,capability years,well web,scraping
yes,people capability,toolset scraping,couple people,paid ve,web couple,years well,wondering wondering,capability years,well web,scraping
657.142857142857
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
dogmaticambivalence
dogmaticambivalence
980.293428622559
1116.85498046875
1448.09338378906
2
1
78
0.02445
0
0.002483
0
0.5
342
DogmaticAmbivalence
2/18/2018 8:46:34 PM
0
34
6359
0
False
False
False
False
True
False
t2_xp362o6
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/DogmaticAmbivalence
2
2
2.04081632653061
1
1.02040816326531
0
0
44
44.8979591836735
98
Commented
Commented
webscraping
webscraping
started stuff toolset terms help people found anything tools octoparse
started stuff toolset terms help people found anything tools octoparse
money,stuff anything,beats beats,home python,libraries really,pay started,pre days,seems hadn't,heard home,library stuff,makes
money,stuff anything,beats beats,home python,libraries really,pay started,pre days,seems hadn't,heard home,library stuff,makes
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
timee_bot
timee_bot
1
2444.60083007813
6789.43310546875
0
1
0
0.023961
0.153925
0.002133
0
0
343
timee_bot
7/27/2017 4:35:11 PM
0
1
102322
0
False
False
False
False
True
False
t2_8j8fti5
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/timee_bot
3
0
0
0
0
0
0
6
50
12
Commented
Commented
u_Octoparseideas
u_Octoparseideas
59 11 tonight view pm timezone
59 11 tonight view pm timezone
view,timezone 11,59 timezone,tonight 59,pm tonight,11
view,timezone 11,59 timezone,tonight 59,pm tonight,11
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
dondontootles
dondontootles
1
8718.95703125
8467.814453125
1
1
0
0.005589
0
0.002178
0
1
344
Dondontootles
11/2/2012 2:57:38 AM
0
1484
17548
0
False
False
False
False
True
False
t2_9h9uh
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Dondontootles
17
1
1.81818181818182
0
0
0
0
24
43.6363636363636
55
Commented
Commented
learnprogramming
learnprogramming
need twitter think minimal backend find node programming load simply
need twitter think minimal backend find node programming load simply
javascript,simply simply,ajax use,node need,find available,those twitter,widgets keyword,search those,sort programming,ll backend,otherwise
javascript,simply simply,ajax use,node need,find available,those twitter,widgets keyword,search those,sort programming,ll backend,otherwise
100
https://styles.redditmedia.com/t5_354l4w/styles/profileIcon_icpwfclk6ox51.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=574aefb9d138beac97cf8b157daf2997872fb9ea
hak122hak
hak122hak
1
8516.7890625
9512.9208984375
0
1
0
0.005589
0
0.002178
0
0
345
hak122hak
9/19/2020 6:26:18 AM
0
22
4
0
False
False
False
False
True
False
t2_85da9ucq
False
False
False
https://styles.redditmedia.com/t5_354l4w/styles/profileIcon_icpwfclk6ox51.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=574aefb9d138beac97cf8b157daf2997872fb9ea
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/hak122hak
17
0
0
0
0
0
0
32
53.3333333333333
60
Commented
Commented
learnprogramming
learnprogramming
python 000 twitter tweets more library snscrape send application bot
python 000 twitter tweets more library snscrape send application bot
more,000 twitter,python 000,tweets 000,000 youtube,python telegram,application playlists,youtube application,python tweets,twitter python,udemy
more,000 twitter,python 000,tweets 000,000 youtube,python telegram,application playlists,youtube application,python tweets,twitter python,udemy
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
yannick_data
yannick_data
1
1135.37414550781
8935.3603515625
1
1
0
0
0
0.002439
0
0
346
yannick_data
6/6/2022 8:29:11 PM
0
1
0
0
False
False
False
False
True
False
t2_kl17jkcu
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/yannick_data
1
2
0.278551532033426
4
0.557103064066852
0
0
438
61.0027855153203
718
Posted
Posted
u_yannick_data
u_yannick_data
reviews this webscraping site are darty fr clients scraping case
reviews this webscraping site are darty fr clients scraping case
reviews,clients considered,as transformation,digital web,legal let's take,example fr,transformation contract,trust trusted,shop case,usage site,accepts
reviews,clients considered,as transformation,digital web,legal let's take,example fr,transformation contract,trust trusted,shop case,usage site,accepts
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
wikitextbot
wikitextbot
1
8745.048828125
7022.08984375
0
1
0
0.004401
0
0.002219
0
0
347
WikiTextBot
6/4/2017 11:49:39 AM
0
3141592
3141592
0
False
False
False
False
True
False
t2_32duad4
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/WikiTextBot
27
1
0.709219858156028
0
0
0
0
90
63.8297872340426
141
RepliedTo
RepliedTo
Python
Python
cron reddit wikitextbot message intervals exclude np software excludeme time
cron reddit wikitextbot message intervals exclude np software excludeme time
np,reddit reddit,message message,compose systems,people subject,excludeme χρόνος,chronos commands,shell operating,systems excludeme,exclude people,set
np,reddit reddit,message message,compose systems,people subject,excludeme χρόνος,chronos commands,shell operating,systems excludeme,exclude people,set
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
cokoop
cokoop
1
8965.146484375
7178.720703125
2
2
0
0.004401
0
0.002399
0
1
348
cokoop
5/11/2015 3:36:10 AM
0
60
8129
0
False
False
False
False
True
False
t2_nek14
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/cokoop
27
3
1.3215859030837
3
1.3215859030837
0
0
79
34.8017621145374
227
Posted RepliedTo
RepliedTo Posted
Python
Python
website need given something pictures create copy exist 15th back
website need given something pictures create copy exist 15th back
website,specifically involved,thanks figure,know built,create smart,guy ask,cron date,stamped commercially,need day's,website guy,write
website,specifically involved,thanks figure,know built,create smart,guy ask,cron date,stamped commercially,need day's,website guy,write
100
https://styles.redditmedia.com/t5_76z71z/styles/profileIcon_odpgx41m2it91.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5d9ba5dff421abfb605bfe2c1b14f259b10dbca4
octoparse-hola
octoparse-hola
1
1451.29711914063
8935.3603515625
1
1
0
0
0
0.002439
0
0
349
Octoparse-hola
10/13/2022 4:20:26 AM
0
1
0
0
False
False
False
False
True
False
t2_sw1uojjf
False
False
False
https://styles.redditmedia.com/t5_76z71z/styles/profileIcon_odpgx41m2it91.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5d9ba5dff421abfb605bfe2c1b14f259b10dbca4
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/Octoparse-hola
1
0
0
0
0
0
0
200
69.9300699300699
286
Posted
Posted
u_Octoparse-hola
u_Octoparse-hola
credits save annual personalized captcha proxies training black crawler friday
first ffd7d899d266bffdeabc3d5c7dbe2063a2ddb54f utm_campaign 16 d0dc67a3aff84eabec36458a6c0a6a806396810d utm_medium first day social 30
annual,save crawler,personalized credits,templates solving,captcha credits,proxies proxies,residential credits,solving templates,crawler personalized,training black,friday
octoparse,black today,first qxib1qnn6o0a1,jpg first,day discount,credits 500,15gb upgrade,plan enabled,ffd7d899d266bffdeabc3d5c7dbe2063a2ddb54f 201,3gb d0dc67a3aff84eabec36458a6c0a6a806396810d,black
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
leo_paredes2354
leo_paredes2354
1
9781.1142578125
1974.517578125
0
1
0
0.002445
0
0.002269
0
0
350
Leo_Paredes2354
9/2/2020 7:04:33 PM
0
1
0
0
False
False
False
False
True
False
t2_7wunphz3
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/Leo_Paredes2354
54
3
5.17241379310345
1
1.72413793103448
0
0
27
46.551724137931
58
Commented
Commented
webdev
webdev
company finddatalab done image service convenient quality think etc always
company finddatalab done image service convenient quality think etc always
job,perfectly done,job image,json data,site high,quality finddatalab,web json,ftp pdf,image repeatedly,used etc,repeatedly
job,perfectly done,job image,json data,site high,quality finddatalab,web json,ftp pdf,image repeatedly,used etc,repeatedly
100
https://styles.redditmedia.com/t5_gxhmk/styles/profileIcon_snooba44bbbf-884f-48bf-b6c5-ea8c2c38b4b9-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a8ca370da6de70762414f940850c2b2a3dae4d8f
themightykrusher
themightykrusher
1
9781.1142578125
1451.06567382813
2
1
0
0.002445
0
0.002609
0
0
351
themightykrusher
3/16/2018 6:14:52 PM
0
52
48
0
False
False
False
False
True
False
t2_4ddcc3
False
False
False
https://styles.redditmedia.com/t5_gxhmk/styles/profileIcon_snooba44bbbf-884f-48bf-b6c5-ea8c2c38b4b9-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a8ca370da6de70762414f940850c2b2a3dae4d8f
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/themightykrusher
54
7
3.46534653465347
2
0.99009900990099
0
0
90
44.5544554455446
202
Posted
Posted
webdev
webdev
jobs web react indeed octoparse moment scraping tried documentation service
jobs web react indeed octoparse moment scraping tried documentation service
indeed,glassdoor html,css project,react scrape,jobs confusing,anyone approved,aka bit,confusing jobs,tried react,looking sort,works
indeed,glassdoor html,css project,react scrape,jobs confusing,anyone approved,aka bit,confusing jobs,tried react,looking sort,works
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
esty_
esty_
1
4307.8447265625
1033.25402832031
2
1
0
0.01063
0
0.002315
0
0
352
esty_
11/28/2018 2:38:42 PM
0
12
2
0
False
False
False
False
True
False
t2_2omkubiu
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/esty_
9
1
0.588235294117647
1
0.588235294117647
0
0
87
51.1764705882353
170
Posted
Posted
googlesheets
googlesheets
div airbnb importxml section data pasteboard c2 one import j6tr8f8
div airbnb importxml section data pasteboard c2 one import j6tr8f8
div,div class,'_m2z73r' section,div airbnb,ianw6066 id,'site div,section co,j6tr8f8 j6tr8f8,png importxml,c2 content',div
div,div class,'_m2z73r' section,div airbnb,ianw6066 id,'site div,section co,j6tr8f8 j6tr8f8,png importxml,c2 content',div
100
https://styles.redditmedia.com/t5_twkxr/styles/profileIcon_snoo82b1edaa-7c95-4fe9-aa9f-b6f2bf1ad836-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=cd6e513bd90ac2d34443e38d275da9fa329f7463
dslakers
dslakers
1
4002.2470703125
9927.7822265625
0
1
0
0.019294
1.4E-05
0.002139
0
0
353
dslakers
1/2/2019 3:43:54 PM
0
126
11949
0
False
False
False
False
True
False
t2_2vgnisjj
False
False
False
https://styles.redditmedia.com/t5_twkxr/styles/profileIcon_snoo82b1edaa-7c95-4fe9-aa9f-b6f2bf1ad836-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=cd6e513bd90ac2d34443e38d275da9fa329f7463
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/dslakers
6
4
8.16326530612245
1
2.04081632653061
0
0
21
42.8571428571429
49
Commented
Commented
webscraping
webscraping
import io standard gold think very offers clients freemium ui
import io standard gold think very offers clients freemium ui
import,io offers,available well,very seems,still towards,larger still,freemium larger,clients used,tool nice,ui freemium,offers
import,io offers,available well,very seems,still towards,larger still,freemium larger,clients used,tool nice,ui freemium,offers
100
https://styles.redditmedia.com/t5_lycam/styles/profileIcon_snoo1a2837ed-0f5b-40ee-873d-08a26840df91-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b0a3c468f4e1fca8930bd539fcb75b81ca174951
hiren_p
hiren_p
1
3748.93603515625
9241.673828125
0
1
0
0.019294
1.4E-05
0.002139
0
0
354
hiren_p
7/18/2018 9:50:26 AM
0
738
39
0
False
False
False
False
True
False
t2_1suj55nv
False
False
False
https://styles.redditmedia.com/t5_lycam/styles/profileIcon_snoo1a2837ed-0f5b-40ee-873d-08a26840df91-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b0a3c468f4e1fca8930bd539fcb75b81ca174951
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/hiren_p
6
4
8.16326530612245
0
0
0
0
18
36.734693877551
49
RepliedTo
RepliedTo
webscraping
webscraping
prowebscraper pricing think create accommodate scrapers curve scrape fast easy
prowebscraper pricing think create accommodate scrapers curve scrape fast easy
unlimited,scrapers use,gui pricing,accommodate prowebscraper,pricing scrape,data scrapers,importantly learning,curve fast,learning pricing,prowebscraper create,unlimited
unlimited,scrapers use,gui pricing,accommodate prowebscraper,pricing scrape,data scrapers,importantly learning,curve fast,learning pricing,prowebscraper create,unlimited
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
pip_install_escher
pip_install_escher
1
3809.11840820313
8040.21923828125
1
1
0
0.019294
1.4E-05
0.002139
0
1
355
pip_install_Escher
3/4/2019 5:42:03 PM
0
10
115
0
False
False
False
False
True
False
t2_3a8hi50t
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/pip_install_Escher
6
1
4.54545454545455
0
0
0
0
10
45.4545454545455
22
RepliedTo
RepliedTo
webscraping
webscraping
good monetize something data plan used scrapy similar far doing
good monetize something data plan used scrapy similar far doing
something,similar used,scrapy doing,something monetize,data plan,monetize scrapy,far data,set similar,used far,good set,doing
something,similar used,scrapy doing,something monetize,data plan,monetize scrapy,far data,set similar,used far,good set,doing
100
https://styles.redditmedia.com/t5_7vd4s/styles/profileIcon_snoo7e1866a7-d284-4947-9f98-585241511bd8-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=eaa60c7f305c4ffeeba3427bb19cc76abaa4b0e8
k_smith182
k_smith182
1
3850.439453125
9746.2255859375
1
1
0
0.019294
1.4E-05
0.002139
0
1
356
k_smith182
6/26/2017 7:58:59 PM
0
13
52
0
False
False
False
False
True
False
t2_6ccnzt
False
False
False
https://styles.redditmedia.com/t5_7vd4s/styles/profileIcon_snoo7e1866a7-d284-4947-9f98-585241511bd8-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=eaa60c7f305c4ffeeba3427bb19cc76abaa4b0e8
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/k_smith182
6
3
2.04081632653061
1
0.680272108843537
0
0
66
44.8979591836735
147
Commented RepliedTo
Commented RepliedTo
webscraping
webscraping
solution import outsource shalion io more ecommerce skip gather ad
solution import outsource shalion io ecommerce skip gather ad marketing
import,io outsource,solution experts,ecommerce suggest,one more,details data,using one,tell although,tool between,each stuff,python
import,io outsource,solution experts,ecommerce suggest,one more,details data,using one,tell although,tool between,each stuff,python
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
buneme
buneme
1
3953.01293945313
7883.29833984375
0
1
0
0.019294
1.4E-05
0.002139
0
0
357
buneme
9/7/2015 6:25:18 PM
0
145
19
0
False
False
False
False
True
False
t2_q967h
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/buneme
6
4
6.89655172413793
0
0
0
0
27
46.551724137931
58
Commented
Commented
webscraping
webscraping
more requests product techniques lot affordable start scrape interested free
more requests product techniques lot affordable start scrape interested free
paid,plans more,information lot,more 50,monthly product,50 monthly,requests scrape,data data,product built,api 15,month
paid,plans more,information lot,more 50,monthly product,50 monthly,requests scrape,data data,product built,api 15,month
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
plenty-explorer-9854
plenty-explorer-9854
1
3735.17333984375
8601.498046875
0
1
0
0.019294
1.4E-05
0.002139
0
0
358
Plenty-Explorer-9854
9/8/2021 5:54:26 AM
0
1
-1
0
False
False
False
False
True
False
t2_eel32bvr
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Plenty-Explorer-9854
6
4
6.34920634920635
1
1.58730158730159
0
0
30
47.6190476190476
63
Commented
Commented
webscraping
webscraping
proxycrawl websites data solutions plenty trust blocks 500 extraction reliable
proxycrawl websites data solutions plenty trust blocks 500 extraction reliable
proxycrawl,many proxycrawl,proxycrawl reliable,api extraction,needs companies,use plenty,try websites,extract without,worrying programmatically,requesting needs,proxycrawl
proxycrawl,many proxycrawl,proxycrawl reliable,api extraction,needs companies,use plenty,try websites,extract without,worrying programmatically,requesting needs,proxycrawl
100
https://styles.redditmedia.com/t5_7d2ecz/styles/profileIcon_nkirdsfc5h1a1.png?width=256&height=256&crop=256:256,smart&v=enabled&s=adaf076b499c22bd2a8abf0b93c751d41b4c0cf3
thesentimentai
thesentimentai
1
503.528503417969
8935.3603515625
1
1
0
0
0
0.002439
0
0
359
Thesentimentai
11/11/2022 12:23:01 PM
0
1
0
0
False
False
False
False
True
False
t2_u70z55di
False
False
False
https://styles.redditmedia.com/t5_7d2ecz/styles/profileIcon_nkirdsfc5h1a1.png?width=256&height=256&crop=256:256,smart&v=enabled&s=adaf076b499c22bd2a8abf0b93c751d41b4c0cf3
False
False
False
True
Open Reddit Page for This Person
https://www.reddit.com/user/Thesentimentai
1
26
4.4750430292599
4
0.688468158347676
0
0
311
53.5283993115318
581
Posted
Posted
u_Thesentimentai
u_Thesentimentai
social media scraping data one png ## well information thesentimentai
social media scraping data one png ## well information thesentimentai
social,media media,scraping ##,social scraping,services preview,redd auto,webp thesentimentai,#services media,sites format,png web,scraping
social,media media,scraping ##,social scraping,services preview,redd auto,webp thesentimentai,#services media,sites format,png web,scraping
1000
https://styles.redditmedia.com/t5_6zd821/styles/profileIcon_snoobe2cc1ba-8bf7-4022-b9af-c780facdb17f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ede7798a52e365dd56c999099c61e0ca7fc6f12f
fantomhouse
fantomhouse
2763.10967047388
3275.748046875
1060.83178710938
7
3
220
0.016576
0
0.003649
0
0.333333333333333
360
FantomHouse
9/3/2022 7:13:44 PM
0
19
0
0
False
False
False
False
True
False
t2_s671nryl
False
False
True
https://styles.redditmedia.com/t5_6zd821/styles/profileIcon_snoobe2cc1ba-8bf7-4022-b9af-c780facdb17f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=ede7798a52e365dd56c999099c61e0ca7fc6f12f
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/FantomHouse
4
7
2.86885245901639
13
5.32786885245902
0
0
91
37.2950819672131
244
RepliedTo Posted
Posted RepliedTo
webscraping
webscraping
discontinued free charged account bank card trial read except few
few charged read free account bank card trial except call
free,trial really,much still,charged access,acc bank,transaction read,plz record,disabled provide,except disabled,account transaction,record
free,trial really,much still,charged access,acc bank,transaction read,plz record,disabled provide,except disabled,account transaction,record
100
https://styles.redditmedia.com/t5_5i4kxq/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18xODIyMDk_rare_e1077416-d3bc-4f86-849a-6bbc28a4df5a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=92728804bbd66e5971cf6231803a220249067dd8
--silas--
--silas--
1
3474.73706054688
1093.30847167969
0
1
0
0.012538
0
0.002151
0
0
361
--silas--
12/16/2021 8:47:31 PM
0
1123
3083
0
False
False
False
False
True
False
t2_h7ikbhpz
False
False
True
https://styles.redditmedia.com/t5_5i4kxq/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N18xODIyMDk_rare_e1077416-d3bc-4f86-849a-6bbc28a4df5a-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=92728804bbd66e5971cf6231803a220249067dd8
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/--silas--
4
1
5.26315789473684
1
5.26315789473684
0
0
8
42.1052631578947
19
RepliedTo
RepliedTo
webscraping
webscraping
privacy unexpected try free saved time lot charges next
privacy unexpected try free saved time lot charges next
privacy,next unexpected,charges next,time time,free try,privacy lot,unexpected privacy,privacy free,saved saved,lot
privacy,next unexpected,charges next,time time,free try,privacy lot,unexpected privacy,privacy free,saved saved,lot
100
https://styles.redditmedia.com/t5_71fjsg/styles/profileIcon_snooacebda17-1828-4ed7-9e06-c5ab0dbd00ed-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f09bcca0f875a7f03392aa98fa91405c3c971c05
bhushankumar_fst
bhushankumar_fst
1
3705.529296875
1620.66918945313
0
1
0
0.010295
0
0.00226
0
0
362
bhushankumar_fst
9/14/2022 7:06:05 PM
0
1
0
0
False
False
False
False
True
False
t2_shrzapxb
False
False
False
https://styles.redditmedia.com/t5_71fjsg/styles/profileIcon_snooacebda17-1828-4ed7-9e06-c5ab0dbd00ed-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f09bcca0f875a7f03392aa98fa91405c3c971c05
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/bhushankumar_fst
4
4
12.9032258064516
0
0
0
0
13
41.9354838709677
31
RepliedTo
RepliedTo
webscraping
webscraping
free extraction credit interested csv quickscraper feel plan excel card
free extraction credit interested csv quickscraper feel plan excel card
credit,card json,csv free,contact excel,interested interested,feel directly,json free,plan csv,excel extraction,directly data,extraction
credit,card json,csv free,contact excel,interested interested,feel directly,json free,plan csv,excel extraction,directly data,extraction
371.428571428571
https://styles.redditmedia.com/t5_j9fhh/styles/profileIcon_snoo39139160-0555-4ac6-8b7d-78424e0249e5-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=60892922c60a9ec29aa175a9731feb984dba5f71
gnobile
gnobile
478.09167035458
3504.50634765625
1393.0029296875
1
1
38
0.012868
0
0.00249
0
0
363
gnobile
5/16/2018 5:51:29 PM
0
1
-40
0
False
False
False
False
True
False
t2_1dsm99wg
False
False
False
https://styles.redditmedia.com/t5_j9fhh/styles/profileIcon_snoo39139160-0555-4ac6-8b7d-78424e0249e5-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=60892922c60a9ec29aa175a9731feb984dba5f71
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/gnobile
4
2
3.07692307692308
0
0
0
0
30
46.1538461538462
65
Commented
Commented
webscraping
webscraping
canceled trial same deal few testing experience mom credit year
canceled trial same deal few testing experience mom credit year
avoid,trial revenue,finally trial,charged charged,monthly experience,paramount credit,card another,year monthly,fee few,years use,canceled
avoid,trial revenue,finally trial,charged charged,monthly experience,paramount credit,card another,year monthly,fee few,years use,canceled
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
crypto_eagle
crypto_eagle
1
3186.83422851563
626.125061035156
0
1
0
0.012538
0
0.002151
0
0
364
Crypto_Eagle
12/17/2017 1:00:01 AM
0
94
137
0
False
False
False
False
True
False
t2_okftz4d
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Crypto_Eagle
4
1
3.33333333333333
3
10
0
0
10
33.3333333333333
30
Commented
Commented
webscraping
webscraping
think cancel email stuff mistake reality company claim refund people
think cancel email stuff mistake reality company claim refund people
company,scam refund,mistake email,asking people,think asking,refund reality,didn didn,email think,cancel mistake,falsely falsely,claim
company,scam refund,mistake email,asking people,think asking,refund reality,didn didn,email think,cancel mistake,falsely falsely,claim
100
https://styles.redditmedia.com/t5_b0qn8/styles/profileIcon_qdye6m0wg1ga1.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d3dff0844587329ade5f37c2582f24c796cf3af6
vanlombardi
vanlombardi
1
3447.82055664063
831.003540039063
0
1
0
0.012538
0
0.002151
0
0
365
Vanlombardi
8/30/2016 11:08:20 PM
0
53
20
0
False
False
False
False
True
False
t2_1112du
False
False
False
https://styles.redditmedia.com/t5_b0qn8/styles/profileIcon_qdye6m0wg1ga1.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d3dff0844587329ade5f37c2582f24c796cf3af6
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Vanlombardi
4
2
4.65116279069767
1
2.32558139534884
0
0
15
34.8837209302326
43
Commented
Commented
webscraping
webscraping
mrscraper mrdiscount founder try instead tier free account disclaimer cheap
mrscraper mrdiscount founder try instead tier free account disclaimer cheap
try,mrscraper instead,free disclaimer,founder cheap,pro founder,cheap mrscraper,mrscraper paying,service use,mrdiscount happy,paying tier,disclaimer
try,mrscraper instead,free disclaimer,founder cheap,pro founder,cheap mrscraper,mrscraper paying,service use,mrdiscount happy,paying tier,disclaimer
100
https://styles.redditmedia.com/t5_3nc97/styles/profileIcon_snoof5745658-483b-4e57-9081-27390ea44e69-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a2fe30985fa30ccc42942ffd149704884ce86dc5
makyol48
makyol48
1
819.451293945313
8935.3603515625
1
1
0
0
0
0.002439
0
0
366
makyol48
6/5/2009 8:34:45 PM
0
319
48
0
False
False
False
False
True
False
t2_3i4vi
False
False
True
https://styles.redditmedia.com/t5_3nc97/styles/profileIcon_snoof5745658-483b-4e57-9081-27390ea44e69-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a2fe30985fa30ccc42942ffd149704884ce86dc5
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/makyol48
1
Posted
Posted
startupbuffer
startupbuffer
100
https://styles.redditmedia.com/t5_dtqh0/styles/profileIcon_snoo4fd06c50-6cc1-43b0-9767-aff8394ed137-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c9912d6906d6446d5df4ab27fd893a93c728f4c2
guattarist
guattarist
1
2155.7255859375
934.293090820313
1
1
0
0.018808
0
0.002153
0
1
367
guattarist
11/10/2011 3:27:47 PM
0
2245
6448
0
False
False
False
False
True
False
t2_67vzd
False
False
False
https://styles.redditmedia.com/t5_dtqh0/styles/profileIcon_snoo4fd06c50-6cc1-43b0-9767-aff8394ed137-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c9912d6906d6446d5df4ab27fd893a93c728f4c2
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/guattarist
2
1
5.55555555555556
0
0
0
0
6
33.3333333333333
18
Commented
Commented
datascience
datascience
beautiful scraping thing ve soup anymore done
beautiful scraping thing ve soup anymore done
beautiful,soup scraping,beautiful ve,done done,scraping soup,thing thing,anymore
beautiful,soup scraping,beautiful ve,done done,scraping soup,thing thing,anymore
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
nivvy_miz
nivvy_miz
1
8750.9775390625
71.2179489135742
2
2
0
0.002445
0
0.002609
0
1
368
Nivvy_Miz
2/13/2020 9:35:32 PM
0
569
3172
0
False
False
False
False
True
False
t2_5of8tou2
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Nivvy_Miz
53
0
0
4
4.8780487804878
0
0
33
40.2439024390244
82
RepliedTo Posted
Posted RepliedTo
webscraping
webscraping
octoparse don ve need program stupid look called many error
octoparse don ve need program stupid look called many error
come,issue option,wondering many,hours called,octoparse scraping,using stupid,error anyone,knows isn,option understand,wrong need,web
come,issue option,wondering many,hours called,octoparse scraping,using stupid,error anyone,knows isn,option understand,wrong need,web
100
https://styles.redditmedia.com/t5_2ezea0/styles/profileIcon_snoo5b7c768f-8a1c-4156-9f6f-27a9fa6603c6-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b7e26078ca41940ffde41b3a8f1ec2cbcdabc673
tsk4ro
tsk4ro
1
8377.4599609375
1118.12182617188
1
1
0
0.002445
0
0.002269
0
1
369
Tsk4ro
2/9/2020 8:42:05 PM
0
12
10
0
False
False
False
False
True
False
t2_5n8kyzaa
False
False
False
https://styles.redditmedia.com/t5_2ezea0/styles/profileIcon_snoo5b7c768f-8a1c-4156-9f6f-27a9fa6603c6-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=b7e26078ca41940ffde41b3a8f1ec2cbcdabc673
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Tsk4ro
53
0
0
0
0
0
0
25
36.231884057971
69
Commented
Commented
webscraping
webscraping
scale big days information try scraping company data used week
scale big days information try scraping company data used week
try,company sites,collect shortly,scale week,shortly kind,applications receive,data time,kind data,week company,datamam used,service
try,company sites,collect shortly,scale week,shortly kind,applications receive,data time,kind data,week company,datamam used,service
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
databoy-thatsme
databoy-thatsme
1
8564.2197265625
1974.517578125
2
2
0
0.002445
0
0.002609
0
1
370
databoy-thatsme
11/5/2020 6:09:24 AM
0
35
237
0
False
False
False
False
True
False
t2_5d5gnn5z
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/databoy-thatsme
52
3
2.52100840336134
0
0
0
0
46
38.6554621848739
119
Posted RepliedTo
RepliedTo Posted
sweatystartup datascience
datascience sweatystartup
much website myself email directory tried first thanks appreciated help
website myself email directory tried first thanks appreciated help gonna
tried,myself directory,record thousands,entries minimal,coding website,directory myself,extremely first,thanks companies,contact email,thousands give,try
tried,myself directory,record thousands,entries minimal,coding website,directory myself,extremely first,thanks companies,contact email,thousands give,try
100
https://styles.redditmedia.com/t5_2c2p2w/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYTMzOTZhZjIwY2U1MmJkM2M3YWI2ZDcwNDZiZTYxNzI1N2Y2MGViOV80MjEz_rare_f6339ea6-c433-471f-b0f2-8673128aa592-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3105df336160a02ad9db0c4a97569d826e169b72
anyhoneydew4
anyhoneydew4
1
8564.2197265625
1451.06567382813
1
1
0
0.002445
0
0.002269
0
1
371
AnyHoneydew4
1/5/2020 6:54:54 PM
0
17430
3026
0
False
False
False
False
True
False
t2_5dmho6a5
False
False
True
https://styles.redditmedia.com/t5_2c2p2w/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYTMzOTZhZjIwY2U1MmJkM2M3YWI2ZDcwNDZiZTYxNzI1N2Y2MGViOV80MjEz_rare_f6339ea6-c433-471f-b0f2-8673128aa592-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3105df336160a02ad9db0c4a97569d826e169b72
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/AnyHoneydew4
52
3
7.5
0
0
0
0
16
40
40
Commented
Commented
sweatystartup
sweatystartup
use youtube channel explaining octoparse yellow link ve tool never
use youtube channel explaining octoparse yellow link ve tool never
use,tool parsehub,super channel,explaining easy,youtube ve,never tool,worth super,easy youtube,channel switching,link never,used
use,tool parsehub,super channel,explaining easy,youtube ve,never tool,worth super,easy youtube,channel switching,link never,used
100
https://styles.redditmedia.com/t5_5s8gwj/styles/profileIcon_snooa799b615-982b-4736-8b9f-fe0ddeb57b11-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=021a6f5ecd638b3318cc7e73508abef7aabb435a
user-new-wth
user-new-wth
1
2799.92846679688
8750.4033203125
1
1
0
0.023961
0.153925
0.002133
0
1
372
User-new-wth
2/2/2022 5:49:26 AM
0
1
-13
0
False
False
False
False
True
False
t2_j9ox2sgb
False
False
False
https://styles.redditmedia.com/t5_5s8gwj/styles/profileIcon_snooa799b615-982b-4736-8b9f-fe0ddeb57b11-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=021a6f5ecd638b3318cc7e73508abef7aabb435a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/User-new-wth
3
4
1.73160173160173
3
1.2987012987013
0
0
122
52.8138528138528
231
Commented
Commented
Octoparse_ideas
Octoparse_ideas
local run start window disable loop insert image load loading
run loop insert runs window captcha go button local disable
local,run load,images start,local images,local image,loading go,loop disable,image local,runs run,window button,available
local,run go,loop local,runs run,window load,images start,local images,local image,loading disable,image button,available
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
m_johny
m_johny
1
187.605651855469
8273.7451171875
1
1
0
0
0
0.002439
0
0
373
M_Johny
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/M_Johny
1
Posted
Posted
InternetIsBeautiful
InternetIsBeautiful
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
eghantous
eghantous
1
4131.1904296875
9072.6201171875
0
2
0
0.020019
2.2E-05
0.002226
1
0
374
eghantous
4/5/2020 10:22:53 PM
0
1
3
0
False
False
False
False
True
False
t2_4ff8gjv8
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/eghantous
6
2
2.7027027027027
1
1.35135135135135
0
0
31
41.8918918918919
74
Commented
Commented
webscraping
webscraping
inferlink rsx octoparse looking free # check different trouble tools
octoparse looking different trouble tools alternative think time continue ours
rsx,inferlink inferlink,# inferlink,rsx listed,different think,depends looking,much time,hands free,octoparse check,free hands,those
listed,different think,depends looking,much time,hands free,octoparse check,free hands,those octoparse,alternative tools,listed octoparse,check
100
https://styles.redditmedia.com/t5_5fo7wf/styles/profileIcon_snoo0ef2b594-c5c9-4c85-9e8d-4bf790117a16-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=eaa5449309cba4a3ae5d4c22081bdb69b1811c78
helenawilliam92
helenawilliam92
1
8564.2197265625
3092.6396484375
0
1
0
0.002445
0
0.002269
0
0
375
Helenawilliam92
12/4/2021 9:07:48 AM
0
1
0
0
False
False
False
False
True
False
t2_h7ktej7s
False
False
False
https://styles.redditmedia.com/t5_5fo7wf/styles/profileIcon_snoo0ef2b594-c5c9-4c85-9e8d-4bf790117a16-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=eaa5449309cba4a3ae5d4c22081bdb69b1811c78
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/Helenawilliam92
51
3
4.91803278688525
1
1.63934426229508
0
0
32
52.4590163934426
61
Commented
Commented
u_ScrapperExpert
u_ScrapperExpert
wersel io scraping web data hub site tools extraction integrated
wersel io scraping web data hub site tools extraction integrated
wersel,io web,scraping dealing,complicated complicated,websites free,demo focused,web visit,official scraping,agents scraping,tool comes,dealing
wersel,io web,scraping dealing,complicated complicated,websites free,demo focused,web visit,official scraping,agents scraping,tool comes,dealing
100
https://styles.redditmedia.com/t5_5f37q2/styles/profileIcon_snoo83a4ec35-41c7-4190-a8f0-4a5f73d3775d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=34aa7c9c5668a4e6987d412fb9d6dbd53e9bff05
scrapperexpert
scrapperexpert
1
8564.2197265625
2569.1875
2
1
0
0.002445
0
0.002609
0
0
376
ScrapperExpert
12/1/2021 8:55:26 AM
0
1
0
0
False
False
False
False
True
False
t2_ffd11odt
False
False
False
https://styles.redditmedia.com/t5_5f37q2/styles/profileIcon_snoo83a4ec35-41c7-4190-a8f0-4a5f73d3775d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=34aa7c9c5668a4e6987d412fb9d6dbd53e9bff05
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ScrapperExpert
51
7
6.19469026548673
0
0
0
0
65
57.5221238938053
113
Posted
Posted
u_ScrapperExpert
u_ScrapperExpert
web data extraction scraping monitoring octoparse many scraperapi price make
web data extraction scraping monitoring octoparse many scraperapi price make
web,data web,scraping data,extraction monitoring,price include,price fashion,called parsehub,happy best,web smarter,decisions scrapping,tools
web,data web,scraping data,extraction monitoring,price include,price fashion,called parsehub,happy best,web smarter,decisions scrapping,tools
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
individual_side313
individual_side313
1
4203.458984375
155.519760131836
2
1
0
0.01063
0
0.002315
0
0
377
Individual_Side313
11/11/2020 3:38:51 AM
0
1
0
0
False
False
False
False
True
False
t2_8u91ozfl
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Individual_Side313
9
1
4.16666666666667
0
0
0
0
12
50
24
Posted
Posted
excel
excel
data website each powerbi including details excel removed use octoparse
data website each powerbi including details excel removed use octoparse
each,data octoparse,extract use,octoparse including,details data,including data,excel extract,data website,website excel,powerbi powerbi,website
each,data octoparse,extract use,octoparse including,details data,including data,excel extract,data website,website excel,powerbi powerbi,website
100
https://styles.redditmedia.com/t5_6a0u04/styles/profileIcon_t4b1v9owphfa1.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a34d481cda2cb17666a581d57783a837ae4b2840
octoparse_de
octoparse_de
1
503.528503417969
8273.7451171875
1
1
0
0
0
0.002439
0
0
378
Octoparse_de
4/26/2022 8:57:08 AM
0
1
0
0
False
False
False
False
True
False
t2_lnhpg7bv
False
False
False
https://styles.redditmedia.com/t5_6a0u04/styles/profileIcon_t4b1v9owphfa1.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a34d481cda2cb17666a581d57783a837ae4b2840
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/Octoparse_de
1
6
0.263273365511189
60
2.63273365511189
0
0
1295
56.823168056165
2279
Posted
Posted
u_Octoparse_de
u_Octoparse_de
die octoparse gt lt li regex daten zeichen tabelle klicken
gt lt li die regex zeichen tabelle telefonnummern 7899 456
lt,li li,gt gt,lt 456,7899 123,456 octoparse,summer sale,2022 summer,sale die,daten daten,tabelle
lt,li li,gt gt,lt 456,7899 123,456 daten,tabelle regex,tool 7899,lt gt,123 die,daten
100
https://styles.redditmedia.com/t5_1onhnk/styles/profileIcon_snoo4afe2931-c31b-4cae-9062-c6a4caddca9d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=dc502762bcc6c8f643e95108d46c66b8d9f0d945
htepo
htepo
1
9781.1142578125
3092.6396484375
1
1
0
0.002445
0
0.002269
0
1
379
htepO
10/26/2014 5:54:24 AM
0
72552
100538
0
False
False
False
False
True
False
t2_j1uzh
False
False
True
https://styles.redditmedia.com/t5_1onhnk/styles/profileIcon_snoo4afe2931-c31b-4cae-9062-c6a4caddca9d-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=dc502762bcc6c8f643e95108d46c66b8d9f0d945
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/htepO
50
0
0
1
0.62111801242236
0
0
78
48.4472049689441
161
Commented RepliedTo
Commented RepliedTo
techsupport
techsupport
dictionary file ' item script write word url data html
dictionary ' item url data line term 'w' yourself yak
html,file yourself,upload generate,html script,generate dictionary,ball part,url try,yourself cat,dictionary ',title opposite,actually
yourself,upload generate,html script,generate dictionary,ball part,url try,yourself cat,dictionary ',title opposite,actually upload,full
100
https://styles.redditmedia.com/t5_2s38la/styles/profileIcon_snoo88e05c41-9519-4853-8b8c-67f9c938adc1-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=7b8d7cd226b29f3bd0c123060cc53b1b1cc1850f
pastybums
pastybums
1
9781.1142578125
2569.1875
2
2
0
0.002445
0
0.002609
0
1
380
PastyBums
6/19/2020 11:19:02 PM
0
368
252
0
False
False
False
False
True
False
t2_6d6f8s6h
False
False
False
https://styles.redditmedia.com/t5_2s38la/styles/profileIcon_snoo88e05c41-9519-4853-8b8c-67f9c938adc1-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=7b8d7cd226b29f3bd0c123060cc53b1b1cc1850f
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/PastyBums
50
7
2.5
8
2.85714285714286
0
0
116
41.4285714285714
280
Posted RepliedTo
RepliedTo Posted
techsupport
techsupport
word links data flair together words dictionary hyperlink different put
links data flair together words dictionary hyperlink different put banana
hyperlink,dictionary different,words itself,octoparse parse,urls put,programming better,describes original,words much,time ex,hyperlink access,#x200b
hyperlink,dictionary different,words itself,octoparse parse,urls put,programming better,describes original,words much,time ex,hyperlink access,#x200b
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
turningsteel
turningsteel
1
9969.35546875
7022.08984375
1
1
0
0.004401
0
0.002192
0
1
381
turningsteel
6/2/2013 3:15:32 AM
0
565
102203
0
False
False
False
False
True
False
t2_bw7bn
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/turningsteel
26
1
1.42857142857143
0
0
0
0
32
45.7142857142857
70
RepliedTo
RepliedTo
api
api
data key docs read need product good looking require start
data key need product good looking require start find requesting
read,docs require,key security,read url,returns find,go request,endpoint data,looking make,request data,laptops apis,require
require,key security,read url,returns find,go request,endpoint data,looking make,request data,laptops apis,require checking,product
142.857142857143
https://styles.redditmedia.com/t5_ch8i7/styles/profileIcon_a7kmz2laga941.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c60085165658e226003dae3f9ffee3023b948be1
rigg_enderslaye
rigg_enderslaye
76.3302637401968
9493.478515625
7467.0107421875
4
3
6
0.007335
0
0.003179
0
0.666666666666667
382
Rigg_Enderslaye
3/16/2016 11:25:13 PM
0
132
676
0
False
False
False
False
True
False
t2_wfyp7
False
False
False
https://styles.redditmedia.com/t5_ch8i7/styles/profileIcon_a7kmz2laga941.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c60085165658e226003dae3f9ffee3023b948be1
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Rigg_Enderslaye
26
7
4.54545454545455
1
0.649350649350649
0
0
60
38.961038961039
154
RepliedTo Posted
Posted RepliedTo
api
api
trying experience api buy price best name something start great
buy price best trying experience api name something start great
best,buy directly,generate api,function webcrawler,api correct,manner shot,go gsp,prices worth,shot experience,apis guess,worth
best,buy directly,generate api,function webcrawler,api correct,manner shot,go gsp,prices worth,shot experience,apis guess,worth
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
whattodo-whattodo
whattodo-whattodo
1
9370.54296875
8396.595703125
0
1
0
0.004401
0
0.002192
0
0
383
whattodo-whattodo
4/2/2014 10:57:46 PM
0
4779
38556
0
False
False
False
False
True
False
t2_fypio
False
True
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/whattodo-whattodo
26
0
0
1
2.7027027027027
0
0
16
43.2432432432432
37
RepliedTo
RepliedTo
api
api
simple curl working api dead familiar look use took plus
simple curl working api dead familiar look use took plus
api,dead dead,simple signed,api look,signed took,look give,working examples,method curl,familiar plus,give simple,plus
api,dead dead,simple signed,api look,signed took,look give,working examples,method curl,familiar plus,give simple,plus
100
https://styles.redditmedia.com/t5_18f3ya/styles/profileIcon_snoo2ef5d5ad-df4b-47c1-b5f9-21eab9c7fa08-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=4231306426334b7b7288353667e4fed93a52cdae
dfish17
dfish17
1
9481.251953125
7872.39453125
1
1
0
0.004401
0
0.002192
0
1
384
dfish17
1/29/2017 11:24:44 PM
0
274
-100
0
False
False
False
False
True
False
t2_14wd2e
False
False
False
https://styles.redditmedia.com/t5_18f3ya/styles/profileIcon_snoo2ef5d5ad-df4b-47c1-b5f9-21eab9c7fa08-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=4231306426334b7b7288353667e4fed93a52cdae
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/dfish17
26
1
3.33333333333333
1
3.33333333333333
0
0
11
36.6666666666667
30
Commented RepliedTo
Commented RepliedTo
api
api
products people key transactions large traffic aversion companies unlimited amount
products people key transactions large traffic aversion companies unlimited amount
transactions,people large,unlimited people,directing amount,transactions traffic,products allow,large key,companies registering,key directing,traffic companies,allow
transactions,people large,unlimited people,directing amount,transactions traffic,products allow,large key,companies registering,key directing,traffic companies,allow
100
https://styles.redditmedia.com/t5_7lwbu/styles/profileIcon_snoo13b3795e-0d1f-4cda-8f9a-2d8ffa9abdce-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e81a4fefd6b78fdf458758dc39daf957b451df01
alee001
alee001
1
7597.076171875
5264.787109375
0
1
0
0.00326
0
0.002268
0
0
385
alee001
5/24/2017 1:27:07 AM
0
1
357
0
False
False
False
False
True
False
t2_1yvlzy3
False
False
False
https://styles.redditmedia.com/t5_7lwbu/styles/profileIcon_snoo13b3795e-0d1f-4cda-8f9a-2d8ffa9abdce-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=e81a4fefd6b78fdf458758dc39daf957b451df01
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/alee001
38
0
0
0
0
0
0
9
42.8571428571429
21
RepliedTo
RepliedTo
webscraping
webscraping
rotation ip different second ip's 10 use place location
rotation ip different second ip's 10 use place location
10,second use,ip's ip,rotation rotation,place location,10 second,ip different,location ip's,different
10,second use,ip's ip,rotation rotation,place location,10 second,ip different,location ip's,different
114.285714285714
https://styles.redditmedia.com/t5_1my4hf/styles/profileIcon_snooccd52379-a034-4e64-bdf3-fd13affc2669-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=840eed0205ab58ce3282f8734818c65745ad8ccc
iamfromnigeria
iamfromnigeria
26.1100879133989
7597.076171875
4598.89892578125
1
1
2
0.00489
0
0.002597
0
0
386
IamFromNigeria
2/5/2015 11:18:07 PM
0
47
573
0
False
False
False
False
True
False
t2_l6htd
False
False
True
https://styles.redditmedia.com/t5_1my4hf/styles/profileIcon_snooccd52379-a034-4e64-bdf3-fd13affc2669-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=840eed0205ab58ce3282f8734818c65745ad8ccc
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/IamFromNigeria
38
1
1.17647058823529
2
2.35294117647059
0
0
39
45.8823529411765
85
Commented
Commented
webscraping
webscraping
additions octoparse shop website scrap supermart manoapp 245 sub snacks
additions octoparse shop website scrap supermart manoapp 245 sub snacks
sub,category shop,manoapp category,additions manoapp,categories 245,snacks categories,245 supermart,ng additions,additions ng,sub website,anti
sub,category shop,manoapp category,additions manoapp,categories 245,snacks categories,245 supermart,ng additions,additions ng,sub website,anti
100
https://styles.redditmedia.com/t5_3oairv/styles/profileIcon_snoo46c79946-b12e-4cb5-8e46-a92997d3f22b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c7760257ca2ec287f29eb7d775fcbbb8b0716822
rd_md005
rd_md005
1
7826.81884765625
5264.787109375
2
1
0
0.00326
0
0.002452
0
0
387
rd_md005
1/5/2021 8:08:25 AM
0
1
0
0
False
False
False
False
True
False
t2_1ct31nhi
False
False
False
https://styles.redditmedia.com/t5_3oairv/styles/profileIcon_snoo46c79946-b12e-4cb5-8e46-a92997d3f22b-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=c7760257ca2ec287f29eb7d775fcbbb8b0716822
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/rd_md005
38
2
2.5
1
1.25
0
0
35
43.75
80
Posted
Posted
webscraping
webscraping
octoparse ip using scraping expertise need considering lot hello approaches
octoparse ip using scraping expertise need considering lot hello approaches
ip,rotation scraping,libraries please,answer relatively,web everyone,relatively considering,using octoparse,using ip,list blockage,sites list,ip
ip,rotation scraping,libraries please,answer relatively,web everyone,relatively considering,using octoparse,using ip,list blockage,sites list,ip
100
https://styles.redditmedia.com/t5_c1cdk/styles/profileIcon_snoofd8e0bff-abb8-4501-a899-0e8511bbbfee-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=aaad8b3129bdd19e4cd254c4ea9104e664448fa0
town_girl
town_girl
1
785.981567382813
71.2179489135742
1
1
0
0.024
1E-06
0.002144
0
1
388
town_girl
12/12/2016 7:12:24 PM
0
58
1499
0
False
False
False
False
True
False
t2_13in1j
False
False
False
https://styles.redditmedia.com/t5_c1cdk/styles/profileIcon_snoofd8e0bff-abb8-4501-a899-0e8511bbbfee-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=aaad8b3129bdd19e4cd254c4ea9104e664448fa0
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/town_girl
2
2
1.12359550561798
2
1.12359550561798
0
0
74
41.5730337078652
178
Commented
Commented
learnprogramming
learnprogramming
css find td ' help child siblings tags practice index
css find td ' help child siblings tags practice index
practice,css cssselector,option recommend,web adjacent,siblings find,index god,sibling match,once second,know another,td position,practice
practice,css cssselector,option recommend,web adjacent,siblings find,index god,sibling match,once second,know another,td position,practice
100
https://styles.redditmedia.com/t5_6wschw/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N180ODk0Njc0_rare_a58fea0a-9060-4ca4-aca6-d9d1a28d1892-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5aa5fe083c8cc65cd9c47bac189e234ef6765a31
carrythen0thing
carrythen0thing
1
4996.1669921875
9717.8095703125
1
1
0
0.012868
0
0.002144
0
1
389
carrythen0thing
8/21/2022 1:55:28 PM
0
83
3476
0
False
False
False
False
True
False
t2_rr3bndci
False
False
False
https://styles.redditmedia.com/t5_6wschw/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N180ODk0Njc0_rare_a58fea0a-9060-4ca4-aca6-d9d1a28d1892-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5aa5fe083c8cc65cd9c47bac189e234ef6765a31
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/carrythen0thing
8
2
7.14285714285714
0
0
0
0
12
42.8571428571429
28
Commented
Commented
selfhosted
selfhosted
huginn archivebox io maybe features work github changedetection important huginn#readme
huginn archivebox io maybe features work github changedetection important huginn#readme
important,changedetection archivebox,archivebox huginn,huginn#readme work,maybe huginn,github archivebox,io io,huginn changedetection,io maybe,archivebox features,important
important,changedetection archivebox,archivebox huginn,huginn#readme work,maybe huginn,github archivebox,io io,huginn changedetection,io maybe,archivebox features,important
100
https://styles.redditmedia.com/t5_2k243v/styles/profileIcon_snoo8b8305ef-b125-4c37-82bc-4ee37945297e-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=88ced17aeb80997ead68bff25c34e5e916795ec2
cbarker151
cbarker151
1
5183.701171875
7050.5771484375
1
1
0
0.012868
0
0.002144
0
1
390
cbarker151
4/7/2020 8:59:55 PM
0
64
876
0
False
False
False
False
True
False
t2_65gpkdx8
False
False
False
https://styles.redditmedia.com/t5_2k243v/styles/profileIcon_snoo8b8305ef-b125-4c37-82bc-4ee37945297e-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=88ced17aeb80997ead68bff25c34e5e916795ec2
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/cbarker151
8
5
6.57894736842105
0
0
0
0
32
42.1052631578947
76
Commented RepliedTo
Commented RepliedTo
selfhosted
selfhosted
pretty sure selenium node know red exactly something scraping specific
sure selenium exactly something scraping specific general handle write personally
node,red pretty,successfully ok,scrapy sure,selenium pagination,ok scraping,general handle,pagination pretty,easily use,node write,csv
pretty,successfully ok,scrapy sure,selenium pagination,ok scraping,general handle,pagination pretty,easily use,node write,csv easily,exactly
100
https://styles.redditmedia.com/t5_7h3wo3/styles/profileIcon_snoo3b4bc251-11ee-4dff-9666-f7bc3e6e103e-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=75ea1de76a5972cc2a80e079719badb93cace059
square_lawfulness_33
square_lawfulness_33
1
5487.65673828125
9587.6298828125
1
1
0
0.012868
0
0.002144
0
1
391
Square_Lawfulness_33
11/28/2022 7:21:33 PM
0
1
329
0
False
False
False
False
True
False
t2_tkcxmyrm
False
False
False
https://styles.redditmedia.com/t5_7h3wo3/styles/profileIcon_snoo3b4bc251-11ee-4dff-9666-f7bc3e6e103e-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=75ea1de76a5972cc2a80e079719badb93cace059
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Square_Lawfulness_33
8
1
3.44827586206897
0
0
0
0
19
65.5172413793103
29
Commented RepliedTo
Commented RepliedTo
selfhosted
selfhosted
zillow python html parser protocol selenium programming runner language needed
zillow python html parser protocol selenium programming runner language needed
zillow,zillow runner,needed python,programming programming,language protocol,caller requests,protocol selenium,javascript html,parser caller,selenium zillow,already
zillow,zillow runner,needed python,programming programming,language protocol,caller requests,protocol selenium,javascript html,parser caller,selenium zillow,already
100
https://styles.redditmedia.com/t5_6jqsz/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV85NTAyODE_rare_edef19d6-8520-444b-982d-886892126b7f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=121d12a3535721dbd109c69503d67c911ddee309
cellerich
cellerich
1
5435.59033203125
7260.4755859375
1
1
0
0.012868
0
0.002144
0
1
392
cellerich
9/24/2017 6:38:48 PM
0
56
135
0
False
False
False
False
True
False
t2_exp09d5
False
False
False
https://styles.redditmedia.com/t5_6jqsz/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV85NTAyODE_rare_edef19d6-8520-444b-982d-886892126b7f-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=121d12a3535721dbd109c69503d67c911ddee309
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/cellerich
8
0
0
0
0
0
0
6
60
10
Commented
Commented
selfhosted
selfhosted
channel look webscraping loads infos youtube
channel look webscraping loads infos youtube
infos,webscraping loads,infos channel,loads look,youtube youtube,channel
infos,webscraping loads,infos channel,loads look,youtube youtube,channel
100
https://styles.redditmedia.com/t5_2rzxwo/styles/profileIcon_snoo91b9c953-4b0f-4b30-bc1a-016516dc1e96-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=4f8d6524267757f27b32c8baf5c1fa78f37d2fc6
codecarter
codecarter
1
5248.06640625
9927.7822265625
1
2
0
0.012868
1E-06
0.002318
0
0
393
codecarter
6/19/2020 2:39:32 AM
0
28
43
0
False
False
False
False
True
False
t2_65frqiwk
False
False
False
https://styles.redditmedia.com/t5_2rzxwo/styles/profileIcon_snoo91b9c953-4b0f-4b30-bc1a-016516dc1e96-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=4f8d6524267757f27b32c8baf5c1fa78f37d2fc6
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/codecarter
8
3
3.7037037037037
0
0
0
0
39
48.1481481481481
81
Commented RepliedTo
Commented RepliedTo
selfhosted
selfhosted
trilium web clipper github instance images zadam plugin same words
trilium web clipper github images zadam plugin same words formatting
web,clipper github,zadam zadam,trilium exe,well well,windows words,images same,linux scrape,words notes,github whole,images
web,clipper github,zadam zadam,trilium exe,well well,windows words,images same,linux scrape,words notes,github whole,images
100
https://styles.redditmedia.com/t5_1j7ka2/styles/profileIcon_snoo6433cf67-53b3-4ea5-a0f1-1f83f0b412ef-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=914b2cbc828c4c570e8bac2a2faaf19897404813
hieudt
hieudt
1
4808.294921875
8149.94677734375
1
1
0
0.012868
0
0.002144
0
1
394
hieudt
8/22/2015 10:19:42 AM
0
8
7
0
False
False
False
False
True
False
t2_prhob
False
False
False
https://styles.redditmedia.com/t5_1j7ka2/styles/profileIcon_snoo6433cf67-53b3-4ea5-a0f1-1f83f0b412ef-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=914b2cbc828c4c570e8bac2a2faaf19897404813
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/hieudt
8
1
5.26315789473684
0
0
0
0
12
63.1578947368421
19
Commented
Commented
selfhosted
selfhosted
crawlab team github promising project tried haven't looks
crawlab team github promising project tried haven't looks
crawlab,team team,crawlab github,crawlab crawlab,github haven't,tried looks,promising project,github tried,project crawlab,looks
crawlab,team team,crawlab github,crawlab crawlab,github haven't,tried looks,promising project,github tried,project crawlab,looks
128.571428571429
https://styles.redditmedia.com/t5_50qg8n/styles/profileIcon_snoo3dc291bc-cc12-4b41-b80d-83262a392c24-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=47a10d4a42fca4d22d0f0ed55b9341be0cde188c
cool-pineapple-123
cool-pineapple-123
51.2201758267979
7257.36962890625
6484.15673828125
3
2
4
0.005501
0
0.002732
0
0.5
395
Cool-Pineapple-123
9/9/2021 8:06:20 PM
0
28
33
0
False
False
False
False
True
False
t2_dmo4a50i
False
False
False
https://styles.redditmedia.com/t5_50qg8n/styles/profileIcon_snoo3dc291bc-cc12-4b41-b80d-83262a392c24-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=47a10d4a42fca4d22d0f0ed55b9341be0cde188c
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Cool-Pineapple-123
25
7
4.40251572327044
2
1.25786163522013
0
0
52
32.7044025157233
159
Posted RepliedTo
RepliedTo Posted
learnpython scrapy
scrapy learnpython
open scrapestorm octoparse gt scrape allow free linkedin guys hey
gt account content see open possible tools way publicly profile
using,octoparse hey,guys tried,using free,paid octoparse,scrapestorm open,section paid,allow content,along account,posted guys,know
content,along account,posted guys,know tool,trick activity,gt see,content open,account posted,tried go,linkedin linkedin,profile
100
https://styles.redditmedia.com/t5_19oc6f/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYmZkNjcwNjY3MDUzZTUxN2E5N2FmZTU2YzkxZTRmODNmMTE2MGJkM184OTU4_rare_80a918e6-ac1e-4c59-adab-b58d4dd3e7c9-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0d89570361f046c8fadb5d75bcf182a109b8aad6
devnull10
devnull10
1
7452.560546875
6950.87158203125
1
1
0
0.003667
0
0.00221
0
1
396
devnull10
12/2/2016 4:19:47 PM
0
343
4550
0
False
False
False
False
True
False
t2_139ckb
False
False
False
https://styles.redditmedia.com/t5_19oc6f/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYmZkNjcwNjY3MDUzZTUxN2E5N2FmZTU2YzkxZTRmODNmMTE2MGJkM184OTU4_rare_80a918e6-ac1e-4c59-adab-b58d4dd3e7c9-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0d89570361f046c8fadb5d75bcf182a109b8aad6
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/devnull10
25
0
0
0
0
0
0
5
45.4545454545455
11
Commented
Commented
learnpython
learnpython
possible profile visible public selenium
possible profile visible public selenium
possible,selenium visible,profile selenium,public public,visible
possible,selenium visible,profile selenium,public public,visible
128.571428571429
https://styles.redditmedia.com/t5_9nash/styles/profileIcon_snoo1f4c4506-ed2d-41e7-809e-b475b87f439b-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0c2a3c6c00e7313ea5c2a8f6487352ef13ff9c25
drivenkey
drivenkey
51.2201758267979
7060.7099609375
6018.880859375
1
2
4
0.005501
0
0.002549
0
0.5
397
drivenkey
12/21/2017 9:05:24 AM
0
95
76
0
False
False
False
False
True
False
t2_2lan239
False
False
False
https://styles.redditmedia.com/t5_9nash/styles/profileIcon_snoo1f4c4506-ed2d-41e7-809e-b475b87f439b-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=0c2a3c6c00e7313ea5c2a8f6487352ef13ff9c25
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/drivenkey
25
5
6.49350649350649
4
5.19480519480519
0
0
28
36.3636363636364
77
RepliedTo Commented
Commented RepliedTo
learnpython
learnpython
yes worked well email broke engineer broken mission something time
email broke engineer broken mission something time pain point few
worked,well well,broken point,headless long,time try,fix yes,custom email,matching custom,code work,busy given,wasnt
well,broken point,headless long,time try,fix yes,custom email,matching custom,code work,busy given,wasnt broke,engineer
100
https://styles.redditmedia.com/t5_4oo643/styles/profileIcon_snooea498150-9640-4d74-90aa-63860c453563-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d586086c8e27b2a555a38f569411541af3ba3f7f
navagile
navagile
1
6862.6396484375
5555
1
1
0
0.003667
0
0.002264
0
1
398
NavaGile
6/29/2021 12:17:42 PM
0
10
136
0
False
False
False
False
True
False
t2_d02z7scn
False
False
False
https://styles.redditmedia.com/t5_4oo643/styles/profileIcon_snooea498150-9640-4d74-90aa-63860c453563-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=d586086c8e27b2a555a38f569411541af3ba3f7f
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/NavaGile
25
0
0
0
0
0
0
2
40
5
RepliedTo
RepliedTo
learnpython
learnpython
fix find
fix find
find,fix
find,fix
100
https://styles.redditmedia.com/t5_76ag6s/styles/profileIcon_snoo257dbbe1-7d0d-4264-9483-86ce33db395e-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=8d449fc367bdc1d82f42d3af64866f119a1e4779
apolloismydog29
apolloismydog29
1
9374.9892578125
3092.6396484375
0
1
0
0.002445
0
0.002269
0
0
399
ApolloIsMyDog29
10/9/2022 4:45:35 PM
0
2
229
0
False
False
False
False
True
False
t2_t7pyjz35
False
False
False
https://styles.redditmedia.com/t5_76ag6s/styles/profileIcon_snoo257dbbe1-7d0d-4264-9483-86ce33db395e-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=8d449fc367bdc1d82f42d3af64866f119a1e4779
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ApolloIsMyDog29
49
1
11.1111111111111
0
0
0
0
5
55.5555555555556
9
Commented
Commented
SEO
SEO
people going great help list lot
people going great help list lot
great,list lot,people going,help list,going help,lot
great,list lot,people going,help list,going help,lot
100
https://styles.redditmedia.com/t5_2at466/styles/profileIcon_96nc3bv61ci81.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=9d2f2ac10b4471a85b3154b9bebbdb5487494eb8
ranaanshul
ranaanshul
1
9374.9892578125
2569.1875
2
1
0
0.002445
0
0.002609
0
0
400
ranaanshul
12/19/2019 4:05:02 PM
0
161
47
0
False
False
False
False
True
False
t2_59c84gj9
False
False
False
https://styles.redditmedia.com/t5_2at466/styles/profileIcon_96nc3bv61ci81.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=9d2f2ac10b4471a85b3154b9bebbdb5487494eb8
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ranaanshul
49
11
9.73451327433628
2
1.76991150442478
0
0
62
54.8672566371681
113
Posted
Posted
SEO
SEO
clusters still good solid features seo large cloud screamingfrog awesome
clusters still good solid features seo large cloud screamingfrog awesome
clusters,clusters data,faqs crawling,large semrush,classics entities,seranking faqs,nightwatch octoparse,good analysis,jetoctopus sheets,still good,bad
clusters,clusters data,faqs crawling,large semrush,classics entities,seranking faqs,nightwatch octoparse,good analysis,jetoctopus sheets,still good,bad
100
https://styles.redditmedia.com/t5_9oooi/styles/profileIcon_snoo66bcf7b6-0b70-4f74-bae1-9f601b250af3-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=130e65fd9449c8301e07cc3aa95ce08fdf83692f
rapid1898
rapid1898
1
8968.8623046875
2569.1875
0
1
0
0.002445
0
0.002439
0
0
401
Rapid1898
12/22/2017 9:43:14 AM
0
205
63
0
False
False
False
False
True
False
t2_fgn4kkw
False
False
False
https://styles.redditmedia.com/t5_9oooi/styles/profileIcon_snoo66bcf7b6-0b70-4f74-bae1-9f601b250af3-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=130e65fd9449c8301e07cc3aa95ce08fdf83692f
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Rapid1898
48
0
0
0
0
0
0
1
16.6666666666667
6
RepliedTo
RepliedTo
webscraping
webscraping
purchased
purchased
100
https://styles.redditmedia.com/t5_4w4163/styles/profileIcon_snoo3e01e0c0-410a-4353-9ab2-943d39ee9dba-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=defc1a0d830a9d76f5a1b259011ade424e3a0fc4
lower-imagination655
lower-imagination655
1
8968.8623046875
3092.6396484375
1
0
0
0.002445
0
0.002439
0
0
402
Lower-Imagination655
8/12/2021 5:03:05 PM
0
3793
144
0
False
False
False
False
True
False
t2_dvj94c3j
False
False
False
https://styles.redditmedia.com/t5_4w4163/styles/profileIcon_snoo3e01e0c0-410a-4353-9ab2-943d39ee9dba-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=defc1a0d830a9d76f5a1b259011ade424e3a0fc4
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Lower-Imagination655
48
100
https://styles.redditmedia.com/t5_fqhs9/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N181MDM1OTE3_rare_f9b42b4a-accd-456b-a731-f86ccdad0ef3-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f0e9602da3d2273fb89cb4a8d0ac1a979497bf47
mih4elll
mih4elll
1
1767.21984863281
8935.3603515625
1
1
0
0
0
0.002439
0
0
403
mih4elll
2/12/2018 10:00:18 PM
0
1
107
0
False
False
False
False
True
False
t2_vqy66al
False
False
False
https://styles.redditmedia.com/t5_fqhs9/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N181MDM1OTE3_rare_f9b42b4a-accd-456b-a731-f86ccdad0ef3-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=f0e9602da3d2273fb89cb4a8d0ac1a979497bf47
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/mih4elll
1
0
0
0
0
0
0
0
0
0
RepliedTo
RepliedTo
webscraping
webscraping
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
spectrosdog
spectrosdog
1
5510.91943359375
4456.30712890625
1
1
0
0.008693
0
0.00216
0
1
404
spectrosdog
8/10/2021 12:53:49 AM
0
67
63
0
False
False
False
False
True
False
t2_dtjj95t7
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/spectrosdog
11
8
6.25
3
2.34375
0
0
46
35.9375
128
Commented RepliedTo
Commented RepliedTo
MDEnts
MDEnts
those one batch yea good cool strains check unfortunately something
yea good cool strains check unfortunately something popular seem thinking
popular,right comparing,strains those,websites elixers,one one,comes dixie,elixers thinking,one something,always super,popular mess,unfortunately
popular,right comparing,strains those,websites elixers,one one,comes dixie,elixers thinking,one something,always super,popular mess,unfortunately
442.857142857143
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
accidentaldavid
accidentaldavid
603.642109921575
5340.685546875
3700.23510742188
6
6
48
0.014225
0
0.00346
0
1
405
AccidentalDavid
6/17/2020 3:21:31 PM
0
5345
1348
0
False
False
False
False
True
False
t2_3z9p3a8l
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/AccidentalDavid
11
15
2.57731958762887
8
1.3745704467354
0
0
298
51.2027491408935
582
RepliedTo Commented Posted
Commented RepliedTo Posted
MDEnts
MDEnts
data dispensaries nbsp cake gelato example sheet working prices discount
nbsp dispensaries cake gelato prices discount deals data example project
gelato,cake tracking,dispensaries data,table current,functionality errors,currently updates,hours cresco,nurse data,current cake,updates jackie,llr
gelato,cake tracking,dispensaries cresco,nurse jackie,llr cake,flower nurse,jackie flower,5g prices,x data,table current,functionality
285.714285714286
https://styles.redditmedia.com/t5_2gtwq7/styles/profileIcon_lb6rlascp4n51.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=8fc37c04572b0d03363eaf21fcc5c61de2b7729a
therustycarr
therustycarr
327.431142874186
5054.23681640625
3530.8291015625
3
2
26
0.011177
0
0.002824
0
0.666666666666667
406
therustycarr
3/3/2020 12:26:29 AM
0
2428
7924
0
False
False
False
False
True
False
t2_3fujo1gb
False
False
False
https://styles.redditmedia.com/t5_2gtwq7/styles/profileIcon_lb6rlascp4n51.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=8fc37c04572b0d03363eaf21fcc5c61de2b7729a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/therustycarr
11
8
3.36134453781513
2
0.840336134453782
0
0
107
44.9579831932773
238
RepliedTo Commented
Commented RepliedTo
MDEnts
MDEnts
points earn purchase earned value point price 8th column 10
earned 8th column 10 determine prices dollar point purchase value
earn,points value,points earned,point many,points prices,earn points,earned 20,points one,gives hand,see points,cheaper
prices,earn points,earned earn,points value,points earned,point many,points 20,points one,gives hand,see points,cheaper
100
https://styles.redditmedia.com/t5_yo46v/styles/profileIcon_snoo56110d51-b0a3-4063-a681-e79d3d3690be-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=4a1436be01d6e9ef6b5c0889933b5a6a16ab00a3
aheckingoodboi
aheckingoodboi
1
4861.03857421875
2905.5556640625
1
1
0
0.007451
0
0.002214
0
1
407
aHeckinGoodBoi
3/24/2019 4:33:26 PM
0
6105
2038
0
False
False
False
False
True
False
t2_36k4w8oq
False
False
False
https://styles.redditmedia.com/t5_yo46v/styles/profileIcon_snoo56110d51-b0a3-4063-a681-e79d3d3690be-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=4a1436be01d6e9ef6b5c0889933b5a6a16ab00a3
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/aHeckinGoodBoi
11
4
2.25988700564972
3
1.69491525423729
0
0
71
40.1129943502825
177
RepliedTo
RepliedTo
MDEnts
MDEnts
points price apply website rewards think user websites generally balances
rewards websites price balances unfortunately back used id giving those
those,price points,user automation,script mean,instead price,balances amount,points rewards,points column,generally buyers,points giving,another
those,price points,user automation,script mean,instead price,balances amount,points rewards,points column,generally buyers,points giving,another
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
sciencereplacedgod
sciencereplacedgod
1
4808.294921875
3992.54760742188
0
1
0
0.007451
0
0.002214
0
0
408
ScienceReplacedgod
4/3/2020 1:50:48 AM
0
8210
14703
0
False
False
False
False
True
False
t2_62v5zqqd
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/ScienceReplacedgod
11
0
0
2
5.88235294117647
0
0
16
47.0588235294118
34
RepliedTo
RepliedTo
MDEnts
MDEnts
points gimmick products people getting more time buyer prices discounts
points gimmick products people getting more time buyer prices discounts
believing,getting discounts,points over,time place,buyer trick,people gimmick,trick more,over deal,paying people,believing make,discounts
believing,getting discounts,points over,time place,buyer trick,people gimmick,trick more,over deal,paying people,believing make,discounts
100
https://styles.redditmedia.com/t5_cy5fo/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N183MDg2MjM5_rare_f27b1486-b076-4592-9fd8-508316a95f89-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=61daa522f811e624621b93b6b186b67e5dde8f0a
aggresivepanda
aggresivepanda
1
5617.85888671875
3784.77270507813
1
1
0
0.008693
0
0.00216
0
1
409
aggresivepanda
8/8/2012 2:52:05 AM
0
10
1309
0
False
False
False
False
True
False
t2_8lhyi
False
False
False
https://styles.redditmedia.com/t5_cy5fo/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N183MDg2MjM5_rare_f27b1486-b076-4592-9fd8-508316a95f89-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=61daa522f811e624621b93b6b186b67e5dde8f0a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/aggresivepanda
11
8
5.92592592592593
1
0.740740740740741
0
0
51
37.7777777777778
135
Commented RepliedTo
Commented RepliedTo
MDEnts
MDEnts
lol menu good ie deals iheartjane few lastly fetch think
menu good ie deals iheartjane few lastly fetch think awesome
ie,bogo think,good seen,sub ie,dutchie bro,google idea,make automated,brought saw,edit automation,recognize well,lol
ie,bogo think,good seen,sub ie,dutchie bro,google idea,make automated,brought saw,edit automation,recognize well,lol
100
https://styles.redditmedia.com/t5_lolpm/styles/profileIcon_snooac7d8e11-2032-42cc-9950-0c6a67e94879-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a4c84361045212a0cf8a0e58230eeabc6bc454b7
smokindope94
smokindope94
1
5623.51123046875
2214.87817382813
0
1
0
0.006803
0
0.002261
0
0
410
SmokinDope94
7/12/2018 5:18:18 PM
0
1
101
0
False
False
False
False
True
False
t2_1rcrn01l
False
False
False
https://styles.redditmedia.com/t5_lolpm/styles/profileIcon_snooac7d8e11-2032-42cc-9950-0c6a67e94879-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=a4c84361045212a0cf8a0e58230eeabc6bc454b7
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/SmokinDope94
11
0
0
0
0
0
0
6
46.1538461538462
13
RepliedTo
RepliedTo
MDEnts
MDEnts
baltimore 8th 40 culta close 75
baltimore 8th 40 culta close 75
close,baltimore culta,40 40,8th 75,close 8th,75
close,baltimore culta,40 40,8th 75,close 8th,75
200
https://styles.redditmedia.com/t5_3vy7c4/styles/profileIcon_snooc49619f6-cbf3-4214-b2ca-e5cb31b2a9ff-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=129b9b6ca4a1772105eef83678a2c5e4dc4e7d0d
briandabs
briandabs
176.770615393793
5465.556640625
2890.90234375
2
1
14
0.00978
0
0.002499
0
0.5
411
briandabs
2/2/2021 5:50:27 PM
0
94
1907
0
False
False
False
False
True
False
t2_a4wrv0u7
False
False
False
https://styles.redditmedia.com/t5_3vy7c4/styles/profileIcon_snooc49619f6-cbf3-4214-b2ca-e5cb31b2a9ff-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=129b9b6ca4a1772105eef83678a2c5e4dc4e7d0d
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/briandabs
11
0
0
0
0
0
0
9
64.2857142857143
14
Commented
Commented
MDEnts
MDEnts
days cake certain gelato wil charges 30 55 dispo
days cake certain gelato wil charges 30 55 dispo
certain,days 30,certain 55,gelato gelato,cake wil,30 charges,55 cake,wil dispo,charges
certain,days 30,certain 55,gelato gelato,cake wil,30 charges,55 cake,wil dispo,charges
100
https://styles.redditmedia.com/t5_ctlrq/styles/profileIcon_snooca9cce36-4c68-4148-ba8c-d825df3f19e5-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5fe334cb135dc80c9a474f08afc8c8e93ef76548
karriejan
karriejan
1
5272.2109375
4557.94873046875
1
1
0
0.008693
0
0.00216
0
1
412
karriejan
12/31/2016 3:51:05 AM
0
8961
10633
0
False
False
False
False
True
False
t2_140d1y
False
False
False
https://styles.redditmedia.com/t5_ctlrq/styles/profileIcon_snooca9cce36-4c68-4148-ba8c-d825df3f19e5-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5fe334cb135dc80c9a474f08afc8c8e93ef76548
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/karriejan
11
0
0
0
0
0
0
13
65
20
Commented RepliedTo
Commented RepliedTo
MDEnts
MDEnts
thanks catonsville releaf city ethos ellicott baltimore taking greenhouse shop
catonsville releaf city ethos ellicott baltimore taking greenhouse shop bunch
ellicott,city greenhouse,ellicott please,thanks thanks,bunch releaf,shop thanks,taking ethos,catonsville catonsville,greenhouse shop,baltimore baltimore,please
ellicott,city greenhouse,ellicott please,thanks thanks,bunch releaf,shop thanks,taking ethos,catonsville catonsville,greenhouse shop,baltimore baltimore,please
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
Show
tinygreenbag
tinygreenbag
1
2083.142578125
8935.3603515625
0
0
0
0
0
0
0
0
413
tinygreenbag
8/5/2014 6:55:45 PM
0
52
9546
0
False
False
False
False
True
False
t2_hqlnr
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/tinygreenbag
1
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
Show
ill-function805
ill-function805
1
819.451293945313
9596.974609375
0
0
0
0
0
0
0
0
414
Ill-Function805
1/22/2021 6:45:16 PM
0
38
28
0
False
False
False
False
True
False
t2_8pgqdx80
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Ill-Function805
1
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
Show
bingoringo2
bingoringo2
1
1135.37414550781
9596.974609375
0
0
0
0
0
0
0
0
415
BingoRingo2
9/16/2016 1:25:40 AM
0
2143
75112
0
False
False
False
False
True
False
t2_11fujv
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/BingoRingo2
1
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
Show
norachoi
norachoi
1
187.605651855469
9596.974609375
0
0
0
0
0
0
0
0
416
NoraChoi
6/2/2016 2:40:18 AM
0
1
0
0
False
False
False
False
True
False
t2_ye08d
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_5.png
False
False
False
True
Open Reddit Page for This Person
https://www.reddit.com/user/NoraChoi
1
100
https://styles.redditmedia.com/t5_2iodwz/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8zMjMzMzU_rare_7a21198f-c5f0-4e62-8674-e970c0ae49e1-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3a788e70ad6f9e8a8863106571ba5303c499d892
Show
inchmine
inchmine
1
503.528503417969
9596.974609375
0
0
0
0
0
0
0
0
417
Inchmine
3/25/2020 3:07:16 PM
0
140
943
0
False
False
False
False
True
False
t2_613695fo
False
False
False
https://styles.redditmedia.com/t5_2iodwz/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfNDY2YTMzMDg4N2JkZjYyZDUzZjk2OGVhODI0NzkzMTUwZjA3NzYyZV8zMjMzMzU_rare_7a21198f-c5f0-4e62-8674-e970c0ae49e1-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3a788e70ad6f9e8a8863106571ba5303c499d892
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Inchmine
1
100
https://styles.redditmedia.com/t5_csedr/styles/profileIcon_snooc63e915f-dbd3-4b98-84c1-ea486967223c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=14b47027dcb5cf74ede2226aa98ac584a92eb2e4
Show
edajmay
edajmay
1
2083.142578125
9596.974609375
0
0
0
0
0
0
0
0
418
EdajMay
1/15/2015 11:50:26 PM
0
2830
32475
0
False
False
False
False
True
False
t2_kqnql
False
False
True
https://styles.redditmedia.com/t5_csedr/styles/profileIcon_snooc63e915f-dbd3-4b98-84c1-ea486967223c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=14b47027dcb5cf74ede2226aa98ac584a92eb2e4
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/EdajMay
1
100
https://styles.redditmedia.com/t5_bangb/styles/profileIcon_snoo967f249e-9984-41d8-a7ca-f6bc66f3333c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6ed21f224599198f4751dba979e59e98d7499d6f
Show
kobyof
kobyof
1
187.605651855469
8935.3603515625
0
0
0
0
0
0
0
0
419
kobyof
9/29/2015 2:44:00 PM
0
27527
5377
0
False
False
False
False
True
False
t2_qtbi0
False
False
False
https://styles.redditmedia.com/t5_bangb/styles/profileIcon_snoo967f249e-9984-41d8-a7ca-f6bc66f3333c-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=6ed21f224599198f4751dba979e59e98d7499d6f
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/kobyof
1
100
https://styles.redditmedia.com/t5_5i9pq2/styles/profileIcon_lao920ingab81.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=c132a5c349877ae57ab5666a4fcd45942b7fbbf0
Show
kuldeep_gera
kuldeep_gera
1
1451.29711914063
9596.974609375
0
0
0
0
0
0
0
0
420
kuldeep_gera
12/17/2021 2:36:27 PM
0
26
134
0
False
False
False
False
True
False
t2_hnruz7yc
False
False
False
https://styles.redditmedia.com/t5_5i9pq2/styles/profileIcon_lao920ingab81.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=c132a5c349877ae57ab5666a4fcd45942b7fbbf0
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/kuldeep_gera
1
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
Show
hexorg
hexorg
1
1767.21984863281
9596.974609375
0
0
0
0
0
0
0
0
421
Hexorg
1/8/2013 9:31:00 PM
0
8192
84817
0
False
False
False
False
True
False
t2_a5he9
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Hexorg
1
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
Show
kfitzdor
kfitzdor
1
819.451293945313
8273.7451171875
0
0
0
0
0
0
0
0
422
kfitzdor
11/21/2017 12:25:16 AM
0
2687
259
0
False
False
False
False
True
False
t2_lln0kd0
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_4.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/kfitzdor
1
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
Show
xdevhunter
xdevhunter
1
2083.142578125
7612.13037109375
0
0
0
0
0
0
0
0
423
XDevHunter
3/24/2021 3:27:41 AM
0
8
0
0
False
False
False
False
True
False
t2_9pe8w1gu
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_6.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/XDevHunter
1
100
https://styles.redditmedia.com/t5_15xkb9/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYjljMDQyYzMyNzViYzQ5Nzk5Njg4ZWVhMWEyOWIxNDA1ZDAyOTQ2Yl85NjAw_rare_e130f886-c71d-44f3-bf64-4441e3d9baf4-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3d98291cff7de4842836248299c08f6672a50072
Show
gruffnutz
gruffnutz
1
187.605651855469
6950.515625
0
0
0
0
0
0
0
0
424
gruffnutz
5/18/2017 1:18:33 PM
0
2977
1232
0
False
False
False
False
True
False
t2_1g6pzpj
False
False
True
https://styles.redditmedia.com/t5_15xkb9/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYjljMDQyYzMyNzViYzQ5Nzk5Njg4ZWVhMWEyOWIxNDA1ZDAyOTQ2Yl85NjAw_rare_e130f886-c71d-44f3-bf64-4441e3d9baf4-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=3d98291cff7de4842836248299c08f6672a50072
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/gruffnutz
1
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
Show
wifeycong
wifeycong
1
1451.29711914063
7612.13037109375
0
0
0
0
0
0
0
0
425
wifeycong
6/22/2016 5:25:50 PM
0
1
-2
0
False
False
False
False
True
False
t2_yxrcg
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_0.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/wifeycong
1
100
https://styles.redditmedia.com/t5_s8azl/styles/profileIcon_oyx5vwyuj9621.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=7693d46a268b136a2faf2ec4d039dbef4a85a622
Show
zeptobook
zeptobook
1
1767.21984863281
7612.13037109375
0
0
0
0
0
0
0
0
426
zeptobook
12/3/2018 4:59:31 PM
0
774
76
0
False
False
False
False
True
False
t2_18w422j1
False
False
False
https://styles.redditmedia.com/t5_s8azl/styles/profileIcon_oyx5vwyuj9621.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=7693d46a268b136a2faf2ec4d039dbef4a85a622
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/zeptobook
1
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
Show
tricky-flamingo4283
tricky-flamingo4283
1
1135.37414550781
6950.515625
0
0
0
0
0
0
0
0
427
Tricky-Flamingo4283
2/21/2022 10:12:27 PM
0
8
389
0
False
False
False
False
True
False
t2_jxppyxtz
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Tricky-Flamingo4283
1
100
https://styles.redditmedia.com/t5_52ak94/styles/profileIcon_5hpdy9fpnio71.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=b23c089eaebeda1d85f6a04d69ed33a2ba2a7baf
Show
bruno3869
bruno3869
1
1451.29711914063
6950.515625
0
0
0
0
0
0
0
0
428
Bruno3869
9/19/2021 8:00:04 PM
0
1
0
0
False
False
False
False
True
False
t2_emzv5yea
False
False
False
https://styles.redditmedia.com/t5_52ak94/styles/profileIcon_5hpdy9fpnio71.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=b23c089eaebeda1d85f6a04d69ed33a2ba2a7baf
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Bruno3869
1
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
Show
gerryjenkinslb
gerryjenkinslb
1
503.528503417969
6950.515625
0
0
0
0
0
0
0
0
429
gerryjenkinslb
10/10/2017 6:02:46 AM
0
1191
36
0
False
False
False
False
True
False
t2_ncw3gj
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_3.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/gerryjenkinslb
1
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
Show
febreezeontherain
febreezeontherain
1
819.451293945313
6950.515625
0
0
0
0
0
0
0
0
430
febreezeontherain
4/27/2014 10:10:03 AM
0
318
1140
0
False
False
False
False
True
False
t2_gbn15
False
False
True
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_1.png
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/febreezeontherain
1
100
https://styles.redditmedia.com/t5_3l8kl/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N180ODg1Njc2_rare_3cead959-b026-4734-bcd6-0914a2d2e8c1-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=180504771958454283beef4e3964ffa2013b23fb
Show
phealthy
phealthy
1
1767.21984863281
8273.7451171875
0
0
0
0
0
0
0
0
431
PHealthy
2/19/2015 1:43:21 PM
0
536826
190937
0
False
False
False
False
True
False
t2_lgnhu
False
True
True
https://styles.redditmedia.com/t5_3l8kl/styles/profileIcon_snoo-nftv2_bmZ0X2VpcDE1NToxMzdfYzhkM2EzYTgzYmRlNWRhZDA2ZDQzNjY5NGUzZTIyYWMzZTY0ZDU3N180ODg1Njc2_rare_3cead959-b026-4734-bcd6-0914a2d2e8c1-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=180504771958454283beef4e3964ffa2013b23fb
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/PHealthy
1
100
https://styles.redditmedia.com/t5_myx2d/styles/profileIcon_dxochlssf2f11.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=473ee57a6076f6ad7e26bed398ad0146eb3669ac
Show
iamplayingdota
iamplayingdota
1
2083.142578125
8273.7451171875
0
0
0
0
0
0
0
0
432
IamPlayingDota
8/9/2018 12:58:32 PM
0
1
20
0
False
False
False
False
True
False
t2_1scdb808
False
False
False
https://styles.redditmedia.com/t5_myx2d/styles/profileIcon_dxochlssf2f11.jpg?width=256&height=256&crop=256:256,smart&v=enabled&s=473ee57a6076f6ad7e26bed398ad0146eb3669ac
False
False
True
False
Open Reddit Page for This Person
https://www.reddit.com/user/IamPlayingDota
1
100
https://styles.redditmedia.com/t5_c7gj6/styles/profileIcon_snoo9dc516a6-f6d2-42dc-860f-d5d59c3775e1-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2243b388a8f2166fbf3a578b2b3bf76968e75ada
Show
psicoguana
psicoguana
1
1135.37414550781
8273.7451171875
0
0
0
0
0
0
0
0
433
Psicoguana
4/4/2015 8:34:46 PM
0
5420
13553
0
False
False
False
False
True
False
t2_mpcdb
False
False
True
https://styles.redditmedia.com/t5_c7gj6/styles/profileIcon_snoo9dc516a6-f6d2-42dc-860f-d5d59c3775e1-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=2243b388a8f2166fbf3a578b2b3bf76968e75ada
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/Psicoguana
1
100
https://styles.redditmedia.com/t5_ymx2o/styles/profileIcon_snoobd5c678f-fff4-454c-83ba-3f5d09f52712-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=1295942b20bc59b28229caf3f331a5380c066838
Show
juanmarcadena
juanmarcadena
1
1451.29711914063
8273.7451171875
0
0
0
0
0
0
0
0
434
juanmarcadena
3/23/2019 11:43:55 PM
0
112
109
0
False
False
False
False
True
False
t2_3gvqlpm6
False
False
False
https://styles.redditmedia.com/t5_ymx2o/styles/profileIcon_snoobd5c678f-fff4-454c-83ba-3f5d09f52712-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=1295942b20bc59b28229caf3f331a5380c066838
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/juanmarcadena
1
100
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
Show
exrussiandude
exrussiandude
1
819.451293945313
7612.13037109375
0
0
0
0
0
0
0
0
435
exrussiandude
1/1/0001 12:00:00 AM
0
0
0
0
False
False
True
False
False
False
False
False
False
https://www.redditstatic.com/avatars/defaults/v2/avatar_default_2.png
False
False
False
False
Open Reddit Page for This Person
https://www.reddit.com/user/exrussiandude
1
100
https://styles.redditmedia.com/t5_apju4/styles/profileIcon_snoo7e4179ca-28ea-4bf6-a702-5f6bfe7f17c8-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=701e28eb292c6974f62524a606ea5a718f1f8b3a
Show
antx
antx
1
1135.37414550781
7612.13037109375
0
0
0
0
0
0
0
0
436
antx
4/3/2011 8:19:41 PM
0
5568
39880
0
False
False
False
False
True
False
t2_51x35
False
False
True
https://styles.redditmedia.com/t5_apju4/styles/profileIcon_snoo7e4179ca-28ea-4bf6-a702-5f6bfe7f17c8-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=701e28eb292c6974f62524a606ea5a718f1f8b3a
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/antx
1
100
https://styles.redditmedia.com/t5_dfeqs/styles/profileIcon_snooe6b26e0c-6e2e-48fc-b510-426b67a2236e-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=1352134fae730c846a95bade48bb15955e26b74f
Show
charlesfire
charlesfire
1
187.605651855469
7612.13037109375
0
0
0
0
0
0
0
0
437
charlesfire
12/2/2015 10:01:43 PM
0
7982
51834
0
False
False
False
False
True
False
t2_smplk
False
False
False
https://styles.redditmedia.com/t5_dfeqs/styles/profileIcon_snooe6b26e0c-6e2e-48fc-b510-426b67a2236e-headshot.png?width=256&height=256&crop=256:256,smart&v=enabled&s=1352134fae730c846a95bade48bb15955e26b74f
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/charlesfire
1
100
https://styles.redditmedia.com/t5_7h5b1/styles/profileIcon_snoo20f52ff0-7eac-4ea5-b327-baad77b542ee-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5f244eecec2d4cc391208fcb2cb462f4198fe4b1
Show
taewoo
taewoo
1
503.528503417969
7612.13037109375
0
0
0
0
0
0
0
0
438
taewoo
11/26/2010 5:50:29 AM
0
3216
1241
0
False
False
False
False
True
False
t2_4k37w
False
False
True
https://styles.redditmedia.com/t5_7h5b1/styles/profileIcon_snoo20f52ff0-7eac-4ea5-b327-baad77b542ee-headshot-f.png?width=256&height=256&crop=256:256,smart&v=enabled&s=5f244eecec2d4cc391208fcb2cb462f4198fe4b1
False
False
True
True
Open Reddit Page for This Person
https://www.reddit.com/user/taewoo
1
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
Yes
815
Commented
8/19/2021 11:42:11 AM
Ethos in Catonsville, Greenhouse in Ellicott City, Releaf Shop in Baltimore, please! Thanks for taking this on!
h9is57j
MDEnts
karriejan
t1_h9is57j
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9is57j/
8/19/2021 11:42:11 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
p6yn2m
t3_p6yn2m
p6yn2m
1
p6yn2m
False
False
False
0
8
11
11
0
0
0
0
0
0
11
64.7058823529412
17
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
814
RepliedTo
8/19/2021 12:12:16 PM
Added! Ethos, and Releaf do not have stock right now!
h9iv3f5
MDEnts
AccidentalDavid
t1_h9iv3f5
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9iv3f5/
8/19/2021 12:12:16 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9is57j
t1_h9is57j
h9is57j
1
p6yn2m
True
False
False
1
4
11
11
1
10
0
0
0
0
4
40
10
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
Yes
813
RepliedTo
8/19/2021 12:34:27 PM
Thanks a bunch!
h9ixfzo
MDEnts
karriejan
t1_h9ixfzo
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9ixfzo/
8/19/2021 12:34:27 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9iv3f5
t1_h9iv3f5
h9iv3f5
0
p6yn2m
False
False
False
2
8
11
11
0
0
0
0
0
0
2
66.6666666666667
3
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
809
Commented
8/18/2021 11:56:56 PM
My dispo charges $55 for Gelato Cake but will do 30% off on certain days
h9gyatu
MDEnts
briandabs
t1_h9gyatu
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9gyatu/
8/18/2021 11:56:56 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
p6yn2m
t3_p6yn2m
p6yn2m
2
p6yn2m
False
False
False
0
4
11
11
0
0
0
0
0
0
9
64.2857142857143
14
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
808
RepliedTo
8/19/2021 12:15:40 AM
You should go somewhere that charges $45 with a deal on the same day!
h9h0ows
MDEnts
AccidentalDavid
t1_h9h0ows
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9h0ows/
8/19/2021 12:15:40 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9gyatu
t1_h9gyatu
h9gyatu
0
p6yn2m
True
False
False
1
4
11
11
0
0
0
0
0
0
7
50
14
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
805
RepliedTo
8/19/2021 12:27:49 AM
Culta has it at $40 an 8th, $75 a 1/4 if you're close to Baltimore
h9h28ku
MDEnts
SmokinDope94
t1_h9h28ku
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9h28ku/
8/19/2021 12:27:49 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9gyatu
t1_h9gyatu
h9gyatu
0
p6yn2m
False
False
False
1
4
11
11
0
0
0
0
0
0
6
46.1538461538462
13
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
Yes
803
Commented
8/18/2021 11:47:01 PM
This is awesome!! However, it's going to be tough to get this automated, as you brought up some good points about the difficulties… some menus paste their deals at the top of the menu w/o it being reflected in the price, bundle deals (ie bogo 50% off) would have to be thought through, and lastly there are a few menu services (ie Dutchie, Weedmaps, Leafly, IHeartJane, etc) so the automation would have to recognize and be able to fetch data from each platform. Makes me think this is a good business idea 🤔 who wants to make a website?? Lol
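The multi-platform problem this comment raises (each menu service would need its own fetcher) maps onto a simple dispatch table. A minimal sketch; the domain keys and the parser stubs are illustrative placeholders, not real integrations with any of these services:

```python
# Hedged sketch: route each dispensary menu URL to a platform-specific
# parser. The parser bodies are stubs; a real version would fetch and
# parse the page for that platform.

def parse_dutchie(url: str) -> str:
    return f"dutchie parser would fetch {url}"

def parse_weedmaps(url: str) -> str:
    return f"weedmaps parser would fetch {url}"

# Map a recognizable domain fragment to its parser.
PARSERS = {
    "dutchie.com": parse_dutchie,
    "weedmaps.com": parse_weedmaps,
}

def fetch_menu(url: str) -> str:
    """Pick the right parser for a menu URL, or fail loudly."""
    for domain, parser in PARSERS.items():
        if domain in url:
            return parser(url)
    raise ValueError(f"no parser for {url}")

print(fetch_menu("https://weedmaps.com/dispensaries/example-menu"))
```

Adding support for another menu service is then one new parser function plus one dictionary entry, which keeps the per-platform scraping logic isolated.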
h9gx2gx
MDEnts
aggresivepanda
t1_h9gx2gx
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9gx2gx/
8/18/2021 11:47:01 PM
8/18/2021 11:50:50 PM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
p6yn2m
t3_p6yn2m
p6yn2m
1
p6yn2m
False
False
False
0
8
11
11
4
4.08163265306122
1
1.02040816326531
0
0
40
40.8163265306122
98
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
802
RepliedTo
8/19/2021 12:14:58 AM
Already made a website 😝
h9h0loi
MDEnts
AccidentalDavid
t1_h9h0loi
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9h0loi/
8/19/2021 12:14:58 AM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9gx2gx
t1_h9gx2gx
h9gx2gx
1
p6yn2m
True
False
False
1
4
11
11
0
0
0
0
0
0
3
75
4
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
Yes
801
RepliedTo
8/19/2021 12:42:47 AM
I was like bro.. that's a Google Sheet, and then saw your edit 2 lol. Right on my man!! This is some of the best work I've seen from this sub in, well.. ever lol
h9h463c
MDEnts
aggresivepanda
t1_h9h463c
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9h463c/
8/19/2021 12:42:47 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9h0loi
t1_h9h0loi
h9h0loi
0
p6yn2m
False
False
False
2
8
11
11
4
10.8108108108108
0
0
0
0
11
29.7297297297297
37
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
791
RepliedTo
8/19/2021 1:51:23 PM
If a place has buyer points, the products' prices make up for any discounts.
Points are a gimmick to trick people into believing that they are getting a deal after paying more over time.
h9j6rtp
MDEnts
ScienceReplacedgod
t1_h9j6rtp
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9j6rtp/
8/19/2021 1:51:23 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9gtdv3
t1_h9gtdv3
h9gtdv3
0
p6yn2m
False
False
False
1
4
11
11
0
0
2
5.88235294117647
0
0
16
47.0588235294118
34
128, 128, 128
3.03311258278146
Dash Dot Dot
49.8580889309366
Yes
789
RepliedTo
8/19/2021 12:36:48 PM
I think the idea here is to scrape for the base price you can get these products for. If you have your own personal buyer points, you can apply those to the price after the fact.
h9ixpea
MDEnts
aHeckinGoodBoi
t1_h9ixpea
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9ixpea/
8/19/2021 12:36:48 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9h3mvn
t1_h9h3mvn
h9h3mvn
1
p6yn2m
False
False
False
3
36
11
11
0
0
0
0
0
0
13
36.1111111111111
36
128, 128, 128
3.03311258278146
Dash Dot Dot
49.8580889309366
Yes
788
RepliedTo
8/19/2021 1:04:43 PM
Fine, but given the same price at different dispensaries, I'd choose the one that gives more valuable points. The problem comes in when you compare sales prices that earn no points with regular prices that earn points. Are the points enough to make a difference? Who knows?
h9j0wnt
MDEnts
therustycarr
t1_h9j0wnt
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9j0wnt/
8/19/2021 1:04:43 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9ixpea
t1_h9ixpea
h9ixpea
1
p6yn2m
False
False
False
4
36
11
11
3
6.38297872340426
1
2.12765957446809
0
0
24
51.063829787234
47
128, 128, 128
3.03311258278146
Dash Dot Dot
49.8580889309366
Yes
787
RepliedTo
8/19/2021 1:09:06 PM
You're right, the problem here is that generally these websites don't offer an API that would allow you to send it your user id and password and return account info like rewards points. If they did, I think this would possibly be feasible.
The only way to implement this would be to write an automation script that signs into the websites and snatches the rewards-point indicator off of them to apply to the price of each product. Unfortunately every website is different and ever-changing, so it would break fairly fast unless constant maintenance was done.
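The brittleness described here is easy to see in miniature: a scraper that pulls a rewards balance off a logged-in page is coupled to the page's exact markup. A minimal sketch; the HTML fragment and the `rewards-points` class name are hypothetical, not any real dispensary's markup:

```python
import re
from typing import Optional

# Hypothetical fragment of a logged-in account page.
account_html = '<div class="rewards-points">Points: 120</div>'

def scrape_points(html: str) -> Optional[int]:
    """Pull the rewards balance out of the page markup.

    Returns None when the expected element is missing, which is
    exactly what happens when the site redesigns the page."""
    m = re.search(r'class="rewards-points">Points:\s*(\d+)', html)
    return int(m.group(1)) if m else None

print(scrape_points(account_html))                   # → 120
print(scrape_points('<div class="pts">120</div>'))   # redesigned page → None
```

The second call shows the maintenance problem: one renamed CSS class and the scraper silently stops finding the balance.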
h9j1fl9
MDEnts
aHeckinGoodBoi
t1_h9j1fl9
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9j1fl9/
8/19/2021 1:09:06 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9j0wnt
t1_h9j0wnt
h9j0wnt
1
p6yn2m
False
False
False
5
36
11
11
4
4.04040404040404
3
3.03030303030303
0
0
40
40.4040404040404
99
128, 128, 128
3.03311258278146
Dash Dot Dot
49.8580889309366
Yes
786
RepliedTo
8/19/2021 1:32:26 PM
No. It's the points you earn from the purchase that reduce the effective price of the purchase. You don't need to know how many points the buyer has because that is just using credit that has already been earned.
However, to your point (and my original point), determining the value of points at each dispensary programmatically is problematic. Still, it might be instructive to look at a few examples by hand and see what you get.
h9j4bni
MDEnts
therustycarr
t1_h9j4bni
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9j4bni/
8/19/2021 1:32:26 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9j1fl9
t1_h9j1fl9
h9j1fl9
1
p6yn2m
False
False
False
6
36
11
11
2
2.63157894736842
1
1.31578947368421
0
0
30
39.4736842105263
76
128, 128, 128
3.03311258278146
Dash Dot Dot
49.8580889309366
Yes
785
RepliedTo
8/19/2021 1:35:37 PM
so you mean instead of importing the amount of points the user has, adding another column to generally point out any discounts the website might have, e.g. what the discount would be if you used member points?
h9j4q2p
MDEnts
aHeckinGoodBoi
t1_h9j4q2p
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9j4q2p/
8/19/2021 1:35:37 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9j4bni
t1_h9j4bni
h9j4bni
1
p6yn2m
False
False
False
7
36
11
11
0
0
0
0
0
0
18
42.8571428571429
42
128, 128, 128
3.03311258278146
Dash Dot Dot
49.8580889309366
Yes
784
RepliedTo
8/19/2021 2:51:48 PM
Hopefully, there's an algorithm to determine how many points are earned and another to determine the value of points, so that extra column would reflect the total dollar value of the points that would be earned by the purchase.
So if every dollar spent earned a point and 100 points was good for $10 off, then the points column would be 10% of the menu price. A $60 8th on sale for 20% off with no points is $48. A $50 8th with the above points system would earn $5 worth of points. Which is cheaper?
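The comparison in this comment can be worked out directly. The earn rate (1 point per dollar) and redemption value ($10 per 100 points) are the hypothetical numbers used above; the function name is illustrative:

```python
def price_net_of_points(menu_price: float, discount_pct: float,
                        earns_points: bool,
                        points_per_dollar: float = 1.0,
                        dollars_per_100_points: float = 10.0) -> float:
    """Effective price: the discounted price minus the dollar value of
    points earned on the purchase (sale items here earn none)."""
    paid = menu_price * (1 - discount_pct / 100)
    points_value = 0.0
    if earns_points:
        points_value = paid * points_per_dollar * dollars_per_100_points / 100
    return round(paid - points_value, 2)

# The two eighths compared in the comment:
print(price_net_of_points(60, 20, earns_points=False))  # $60 at 20% off, no points → 48.0
print(price_net_of_points(50, 0, earns_points=True))    # $50 earning $5 in points → 45.0
```

Under these assumed point values the full-price eighth comes out cheaper, which is the "which is cheaper?" question made concrete.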
h9jf5mv
MDEnts
therustycarr
t1_h9jf5mv
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9jf5mv/
8/19/2021 2:51:48 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9j4q2p
t1_h9j4q2p
h9j4q2p
0
p6yn2m
False
False
False
8
36
11
11
3
3.125
0
0
0
0
44
45.8333333333333
96
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
Yes
797
Commented
8/18/2021 11:17:37 PM
How are you going to account for frequent buyer points?
h9gtdv3
MDEnts
therustycarr
t1_h9gtdv3
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9gtdv3/
8/18/2021 11:17:37 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
p6yn2m
t3_p6yn2m
p6yn2m
2
p6yn2m
False
False
False
0
8
11
11
0
0
0
0
0
0
5
50
10
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
796
RepliedTo
8/19/2021 12:16:32 AM
I am not sure how that would be relevant since that is something you apply to your own purchase.
h9h0svh
MDEnts
AccidentalDavid
t1_h9h0svh
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9h0svh/
8/19/2021 12:16:32 AM
1/1/0001 12:00:00 AM
False
False
4
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9gtdv3
t1_h9gtdv3
h9gtdv3
1
p6yn2m
True
False
False
1
4
11
11
0
0
0
0
0
0
5
26.3157894736842
19
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
Yes
795
RepliedTo
8/19/2021 12:38:41 AM
You earn points with a purchase. That has value.
h9h3mvn
MDEnts
therustycarr
t1_h9h3mvn
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9h3mvn/
8/19/2021 12:38:41 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9h0svh
t1_h9h0svh
h9h0svh
1
p6yn2m
False
False
False
2
8
11
11
0
0
0
0
0
0
4
44.4444444444444
9
128, 128, 128
3.01797540208136
Dash Dot Dot
49.9229625625084
No
821
Posted
8/18/2021 7:45:03 PM
Hey Everybody!
I am in the process of setting up a web scraping system using Octoparse to start scraping the prices of popular strains here in MD.
I have started this project because of the amount of price discrepancy I see when browsing Weedmaps, Leafly, and direct dispensary sites. Many times dispensaries are running seemingly great deals, but after some research you will find they are selling their products for $5 more than everybody else, just recently changed their prices, or are still charging $10+ a gram.
 
For example, you may see that Curaleaf in Gaithersburg is doing a deal today: 30% off all vendors!! Wow, you think, that is crazy.... well, let's use the ever-beloved District Cannabis Gelato Cake as an example:
* Curaleaf GELATO CAKE FLOWER 3.5G $50, after discount $35.
* Kannavis GELATO CAKE FLOWER 3.5G $45, after standard Monday/Saturday discount (20%) $36.
So if you get the amazing holy-crap deal at Curaleaf, you are only saving $1 over most dispensaries in the area on a flower sale day. Oh yeah, and on top of it they are charging $50 an eighth for DC Gelato Cake??????
 
Another great example: today Sweet Buds in Frederick is doing 20% off any cart. A Cresco Nurse Jackie LLR is $65, so after discount $52. At Kannavis they sell Cresco Nurse Jackie LLR at $60, and today it is 20% off any concentrate, making the price after discount $48.
 
So with that being said, the goal of the project currently is the following:
* Track prices of X strains over time, grabbing prices from X dispensaries
* Track Posted deals for each dispensary
Difficulties the project may face:
* Determining if items are the same batch. I plan to match THC levels within a certain threshold but I know many places may not even update these numbers.
* Calculating deals may be a manual task due to most dispensaries posting deals as one large text post.
 
What I am hoping to get from the community:
* Lists of items or strains worth tracking
* Dispensaries that need price monitoring
 
Anyway I would love to hear everyone's thoughts and suggestions!
 
TLDR: Making a database that is going to track pricing of community selected items.
Edit:
Here is the Google Sheet I am actively working on to display this data: https://docs.google.com/spreadsheets/d/1laHReMXNoyf0IOlSGaBJUbpYJq7IL6KTrsbJcVUHGmY/edit?usp=sharing
Current Functionality: Tracking 6 Dispensaries, Gelato Cake, updates every 4 hours.
Data can be best viewed on the "Data in a Table" sheet.
 
Edit 2:
Here is an example website I am working on that ingests the data from weedmaps.
Please ignore any errors it is currently under development.
https://entdex.app/
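The discount comparisons in the post can be reproduced with a short script. The prices and percentages are the ones quoted above; the function name is illustrative:

```python
# Hedged sketch: compare effective prices after each dispensary's
# discount. Prices and discount rates are the examples from the post.

def effective_price(menu_price: float, discount_pct: float) -> float:
    """Price after applying a percentage discount."""
    return round(menu_price * (1 - discount_pct / 100), 2)

# District Cannabis Gelato Cake 3.5g, per the post:
offers = {
    "Curaleaf (30% off deal)": effective_price(50, 30),  # 35.0
    "Kannavis (20% standard)": effective_price(45, 20),  # 36.0
}

# Cheapest first:
for dispensary, price in sorted(offers.items(), key=lambda kv: kv[1]):
    print(f"{dispensary}: ${price:.2f}")
```

Running the same function over every tracked dispensary each scrape cycle is all the "deal or not" comparison requires once the menu prices are in hand.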
p6yn2m
MDEnts
AccidentalDavid
t3_p6yn2m
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/
8/18/2021 7:45:03 PM
8/19/2021 4:36:19 PM
False
False
42
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
MD Medical Marijuana Price Tracking Project
False
0.98
p6yn2m
0
20
11
11
13
2.91479820627803
5
1.12107623318386
0
0
238
53.3632286995516
446
128, 128, 128
3.01797540208136
Dash Dot Dot
49.9229625625084
No
820
Commented
8/19/2021 12:34:13 AM
Here is an example website I am working on that ingests the data from weedmaps.
Please ignore any errors it is currently under development.
https://entdex.app/
h9h32c5
MDEnts
AccidentalDavid
t1_h9h32c5
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9h32c5/
8/19/2021 12:34:13 AM
8/19/2021 4:36:28 PM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
p6yn2m
t3_p6yn2m
p6yn2m
0
p6yn2m
True
False
False
0
20
11
11
0
0
2
8.33333333333333
0
0
10
41.6666666666667
24
128, 128, 128
3.01797540208136
Dash Dot Dot
49.9229625625084
No
819
Commented
8/18/2021 10:35:20 PM
Here is the Google Sheet I am actively working on to display this data:
https://docs.google.com/spreadsheets/d/1laHReMXNoyf0IOlSGaBJUbpYJq7IL6KTrsbJcVUHGmY/edit?usp=sharing
Current Functionality:
Tracking 6 Dispensaries, Gelato Cake, updates every 4 hours.
Data can be best viewed on the "Data in a Table" sheet.
h9gnxef
MDEnts
AccidentalDavid
t1_h9gnxef
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9gnxef/
8/18/2021 10:35:20 PM
1/1/0001 12:00:00 AM
False
False
4
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
p6yn2m
t3_p6yn2m
p6yn2m
0
p6yn2m
True
False
False
0
20
11
11
1
2.7027027027027
0
0
0
0
19
51.3513513513514
37
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
Yes
777
Commented
8/18/2021 9:48:04 PM
very cool, I would say the District Cannabis strains seem to be super popular right now so those would be good to track. Also Dixie Elixirs are one that comes to mind
another cool thing would be if you can get terp numbers so as to identify a "good" batch (on paper). That's something I always check for but is often buried deep in the menu (or not there at all) and kind of a pain to check when comparing strains
h9ghsls
MDEnts
spectrosdog
t1_h9ghsls
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9ghsls/
8/18/2021 9:48:04 PM
8/18/2021 10:00:59 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
p6yn2m
t3_p6yn2m
p6yn2m
1
p6yn2m
False
False
False
0
8
11
11
7
8.64197530864197
1
1.23456790123457
0
0
29
35.8024691358025
81
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
776
RepliedTo
8/18/2021 9:51:48 PM
Yeah, the big issue is that not everyone posts the numbers. I could look into matching posted terp #s to all results with the same THC level.
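Matching listings to the same batch by THC percentage, as suggested here, could be sketched as a pairwise tolerance check. The listings and the 0.2-point threshold are made-up illustrative values, not real menu data:

```python
# Hedged sketch: treat two listings of the same strain as the same
# batch when their THC percentages fall within a tolerance.
THRESHOLD = 0.2  # percentage points of THC; an assumed tolerance

listings = [
    {"dispensary": "A", "strain": "Gelato Cake", "thc": 24.1},
    {"dispensary": "B", "strain": "Gelato Cake", "thc": 24.2},
    {"dispensary": "C", "strain": "Gelato Cake", "thc": 26.8},
]

def same_batch(a: dict, b: dict) -> bool:
    return (a["strain"] == b["strain"]
            and abs(a["thc"] - b["thc"]) <= THRESHOLD)

# Compare every unordered pair of listings.
matches = [(a["dispensary"], b["dispensary"])
           for i, a in enumerate(listings)
           for b in listings[i + 1:]
           if same_batch(a, b)]
print(matches)  # → [('A', 'B')]: only A and B are close enough
```

As the comment notes, this only works when dispensaries actually keep their posted THC numbers current; stale numbers would produce false matches or misses.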
h9gi9zt
MDEnts
AccidentalDavid
t1_h9gi9zt
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9gi9zt/
8/18/2021 9:51:48 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9ghsls
t1_h9ghsls
h9ghsls
1
p6yn2m
True
False
False
1
4
11
11
0
0
1
3.57142857142857
0
0
12
42.8571428571429
28
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
Yes
775
RepliedTo
8/18/2021 9:57:57 PM
Yeah, some of those websites are a mess, unfortunately. I guess I was thinking that if you had one reliable source for that info, then you could automatically pair it, as you said, with the THC amount to deduce that it came from the same batch.
h9gj2pu
MDEnts
spectrosdog
t1_h9gj2pu
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9gj2pu/
8/18/2021 9:57:57 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9gi9zt
t1_h9gi9zt
h9gi9zt
0
p6yn2m
False
False
False
2
8
11
11
1
2.12765957446809
2
4.25531914893617
0
0
17
36.1702127659574
47
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
Yes
774
Commented
8/18/2021 9:48:04 PM
Very cool. I would say the District Cannabis strains seem to be super popular right now, so those would be good to track. Also, Dixie Elixirs are one that comes to mind.
Another cool thing would be if you could get terp numbers so as to identify a "good" batch (on paper). That's something I always check for, but it's often buried deep in the menu (or not there at all) and kind of a pain to check when comparing strains.
h9ghsls
MDEnts
spectrosdog
t1_h9ghsls
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9ghsls/
8/18/2021 9:48:04 PM
8/18/2021 10:00:59 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
p6yn2m
t3_p6yn2m
p6yn2m
1
p6yn2m
False
False
False
0
8
11
11
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
773
RepliedTo
8/18/2021 9:51:48 PM
Yeah, the big issue is that not everyone posts the numbers. I could look into matching posted terp #s to all results with the same THC level.
h9gi9zt
MDEnts
AccidentalDavid
t1_h9gi9zt
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9gi9zt/
8/18/2021 9:51:48 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9ghsls
t1_h9ghsls
h9ghsls
1
p6yn2m
True
False
False
1
4
11
11
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
Yes
772
RepliedTo
8/18/2021 9:57:57 PM
Yeah, some of those websites are a mess, unfortunately. I guess I was thinking that if you had one reliable source for that info, then you could automatically pair it, as you said, with the THC amount to deduce that it came from the same batch.
h9gj2pu
MDEnts
spectrosdog
t1_h9gj2pu
https://www.reddit.com/r/MDEnts/comments/p6yn2m/md_medical_marijuana_price_tracking_project/h9gj2pu/
8/18/2021 9:57:57 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9gi9zt
t1_h9gi9zt
h9gi9zt
0
p6yn2m
False
False
False
2
8
11
11
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
771
RepliedTo
5/27/2022 1:27:48 AM
😄😅
ia51w0i
webscraping
mih4elll
t1_ia51w0i
https://www.reddit.com/r/webscraping/comments/uy5utj/deleted_by_user/ia51w0i/
5/27/2022 1:27:48 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ia51unt
t1_ia51unt
ia51unt
0
uy5utj
False
False
False
1
4
1
1
0
0
0
0
0
0
0
0
0
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
770
RepliedTo
5/27/2022 1:27:48 AM
😄😅
ia51w0i
webscraping
mih4elll
t1_ia51w0i
https://www.reddit.com/r/webscraping/comments/uy5utj/deleted_by_user/ia51w0i/
5/27/2022 1:27:48 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ia51unt
t1_ia51unt
ia51unt
0
uy5utj
False
False
False
1
4
1
1
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
769
RepliedTo
5/26/2022 9:53:58 PM
Where can this be purchased?
ia4b1g0
webscraping
Rapid1898
t1_ia4b1g0
https://www.reddit.com/r/webscraping/comments/uy5utj/deleted_by_user/ia4b1g0/
5/26/2022 9:53:58 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ia2lfjg
t1_ia2lfjg
ia2lfjg
0
uy5utj
False
False
False
1
4
48
48
0
0
0
0
0
0
1
16.6666666666667
6
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
768
RepliedTo
5/26/2022 9:53:58 PM
Where can this be purchased?
ia4b1g0
webscraping
Rapid1898
t1_ia4b1g0
https://www.reddit.com/r/webscraping/comments/uy5utj/deleted_by_user/ia4b1g0/
5/26/2022 9:53:58 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ia2lfjg
t1_ia2lfjg
ia2lfjg
0
uy5utj
False
False
False
1
4
48
48
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
767
Posted
10/10/2022 4:43:42 PM
I've had a lot of messages asking what tools I use for SEO, so I thought I'd share some of my favorites right now;
✅ Frase - Solid for ripping apart SERPs' web pages and has awesome features
✅ WriterZen - Clusters, clusters, clusters, topical research
✅ CogSEO - On-page analysis
✅ JetOctopus - Gold for cloud-crawling large eCommerce websites
✅ ScreamingFrog - Classic tool that's solid as ever. Too many epic features to list
✅ Google Sheets - Still super important for my processes
✅ Ahrefs/SEMrush - Classics that are still big in the game of SEO
✅ Octoparse - Good for scraping data like FAQs
✅ Nightwatch - Checking SERPs
✅ Inlinks - Entities
✅ SEranking - Tracking of course
✅ Majestic - Separate the good from the bad and visualize tiered links
y0jia8
SEO
ranaanshul
t3_y0jia8
https://www.reddit.com/r/SEO/comments/y0jia8/the_best_seo_tools_for_ecommerce_seo/
10/10/2022 4:43:42 PM
1/1/0001 12:00:00 AM
False
False
6
2
Silver:0 Gold:0 Platinum:0 Count:0
False
False
The Best SEO Tools For ECommerce SEO
False
0.69
y0jia8
0
4
49
49
11
9.73451327433628
2
1.76991150442478
0
0
62
54.8672566371681
113
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
766
Posted
10/10/2022 4:43:42 PM
I've had a lot of messages asking what tools I use for SEO, so I thought I'd share some of my favorites right now;
✅ Frase - Solid for ripping apart SERPs' web pages and has awesome features
✅ WriterZen - Clusters, clusters, clusters, topical research
✅ CogSEO - On-page analysis
✅ JetOctopus - Gold for cloud-crawling large eCommerce websites
✅ ScreamingFrog - Classic tool that's solid as ever. Too many epic features to list
✅ Google Sheets - Still super important for my processes
✅ Ahrefs/SEMrush - Classics that are still big in the game of SEO
✅ Octoparse - Good for scraping data like FAQs
✅ Nightwatch - Checking SERPs
✅ Inlinks - Entities
✅ SEranking - Tracking of course
✅ Majestic - Separate the good from the bad and visualize tiered links
y0jia8
SEO
ranaanshul
t3_y0jia8
https://www.reddit.com/r/SEO/comments/y0jia8/the_best_seo_tools_for_ecommerce_seo/
10/10/2022 4:43:42 PM
1/1/0001 12:00:00 AM
False
False
7
2
Silver:0 Gold:0 Platinum:0 Count:0
False
False
The Best SEO Tools For ECommerce SEO
False
0.73
y0jia8
0
4
49
49
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
765
Commented
10/10/2022 7:02:26 PM
Great list! Going to help a lot of people.
irsnt8g
SEO
ApolloIsMyDog29
t1_irsnt8g
https://www.reddit.com/r/SEO/comments/y0jia8/the_best_seo_tools_for_ecommerce_seo/irsnt8g/
10/10/2022 7:02:26 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
y0jia8
t3_y0jia8
y0jia8
0
y0jia8
False
False
False
0
4
49
49
1
11.1111111111111
0
0
0
0
5
55.5555555555556
9
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
764
Commented
10/10/2022 7:02:26 PM
Great list! Going to help a lot of people.
irsnt8g
SEO
ApolloIsMyDog29
t1_irsnt8g
https://www.reddit.com/r/SEO/comments/y0jia8/the_best_seo_tools_for_ecommerce_seo/irsnt8g/
10/10/2022 7:02:26 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
y0jia8
t3_y0jia8
y0jia8
0
y0jia8
False
False
False
0
4
49
49
128, 128, 128
3
Solid
50
Yes
760
RepliedTo
3/23/2023 10:52:37 AM
Did you find a fix?
jdc748i
learnpython
NavaGile
t1_jdc748i
https://www.reddit.com/r/learnpython/comments/svc1vh/scraping_all_the_posts_from_a_linkedin_account_is/jdc748i/
3/23/2023 10:52:37 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hyr4ugd
t1_hyr4ugd
hyr4ugd
1
svc1vh
False
False
False
1
1
25
25
0
0
0
0
0
0
2
40
5
128, 128, 128
3
Solid
50
Yes
759
RepliedTo
3/24/2023 11:12:39 AM
I did, yes (custom code). It worked well for a long time, then broke; the engineer who did the work is busy on something else, so I've given up on it ...for now. Wasn't mission critical to me.
jdh5m6n
learnpython
drivenkey
t1_jdh5m6n
https://www.reddit.com/r/learnpython/comments/svc1vh/scraping_all_the_posts_from_a_linkedin_account_is/jdh5m6n/
3/24/2023 11:12:39 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jdc748i
t1_jdc748i
jdc748i
0
svc1vh
False
False
False
2
1
25
25
3
8.33333333333333
2
5.55555555555556
0
0
11
30.5555555555556
36
128, 128, 128
3
Solid
50
No
761
Commented
2/28/2022 7:48:56 AM
Yes, it's a bit of a pain as they have changed backends a few times, but at one point I had a headless solution to scrape posts and email me on matching keywords. Worked well. Broken now (again), but I might try to fix it.
hyr4ugd
learnpython
drivenkey
t1_hyr4ugd
https://www.reddit.com/r/learnpython/comments/svc1vh/scraping_all_the_posts_from_a_linkedin_account_is/hyr4ugd/
2/28/2022 7:48:56 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
svc1vh
t3_svc1vh
svc1vh
1
svc1vh
False
False
False
0
1
25
25
2
4.8780487804878
2
4.8780487804878
0
0
17
41.4634146341463
41
128, 128, 128
3
Solid
50
Yes
758
Commented
2/18/2022 8:38:41 AM
Should be possible with Selenium... Is it a publicly visible profile?
hxfa7b3
learnpython
devnull10
t1_hxfa7b3
https://www.reddit.com/r/learnpython/comments/svc1vh/scraping_all_the_posts_from_a_linkedin_account_is/hxfa7b3/
2/18/2022 8:38:41 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
svc1vh
t3_svc1vh
svc1vh
1
svc1vh
False
False
False
0
1
25
25
0
0
0
0
0
0
5
45.4545454545455
11
128, 128, 128
3
Solid
50
Yes
757
RepliedTo
2/18/2022 8:46:01 AM
Yes, it's publicly visible.
hxfaqp0
learnpython
Cool-Pineapple-123
t1_hxfaqp0
https://www.reddit.com/r/learnpython/comments/svc1vh/scraping_all_the_posts_from_a_linkedin_account_is/hxfaqp0/
2/18/2022 8:46:01 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxfa7b3
t1_hxfa7b3
hxfa7b3
0
svc1vh
True
False
False
1
1
25
25
0
0
0
0
0
0
3
75
4
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
763
Posted
2/18/2022 7:29:03 AM
Hey guys, is it possible to go to a LinkedIn account -> activity -> posts and scrape all of the posts that the (open) account has posted? I tried to do it using Octoparse and ScrapeStorm, but it doesn't really work well when you open the posts section.
Are there any tools (or tricks), free or paid, that will allow me to do this?
svc1vh
learnpython
Cool-Pineapple-123
t3_svc1vh
https://www.reddit.com/r/learnpython/comments/svc1vh/scraping_all_the_posts_from_a_linkedin_account_is/
2/18/2022 7:29:03 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scraping all the posts from a LinkedIn account, is it possible?
False
0.5
svc1vh
0
4
25
25
3
4.6875
0
0
0
0
24
37.5
64
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
762
Posted
2/18/2022 7:43:28 AM
Hey guys, I'm new to this and I want to know if there is any way that I can scrape all the posts from a certain LinkedIn profile. The purpose of this is to see the content of the posts along with likes and comments just to see what type of content works the best.
I tried to do this using Octoparse and ScrapeStorm but they have an issue when you open the "posts" section.
Is there any tool (or trick) free or paid that will allow me to do this?
svca3s
scrapy
Cool-Pineapple-123
t3_svca3s
https://www.reddit.com/r/scrapy/comments/svca3s/scraping_linkedin_posts_from_a_specific_profile/
2/18/2022 7:43:28 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scraping LinkedIn posts from a specific profile?
False
1
svca3s
0
4
25
25
4
4.3956043956044
2
2.1978021978022
0
0
25
27.4725274725275
91
128, 128, 128
3
Solid
50
Yes
754
Commented
1/4/2023 8:08:02 PM
Haven't tried but this project [https://github.com/crawlab-team/crawlab](https://github.com/crawlab-team/crawlab) looks promising.
j2ya6i1
selfhosted
hieudt
t1_j2ya6i1
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2ya6i1/
1/4/2023 8:08:02 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
102gfg4
t3_102gfg4
102gfg4
1
102gfg4
False
False
False
0
1
8
8
1
5.26315789473684
0
0
0
0
12
63.1578947368421
19
128, 128, 128
3
Solid
50
Yes
753
RepliedTo
1/5/2023 5:02:29 AM
This looks very nice. I will try. Thank you very much.
j30gq7r
selfhosted
Sinclairxer
t1_j30gq7r
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j30gq7r/
1/5/2023 5:02:29 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j2ya6i1
t1_j2ya6i1
j2ya6i1
0
102gfg4
True
False
False
1
1
8
8
2
18.1818181818182
0
0
0
0
5
45.4545454545455
11
128, 128, 128
3
Solid
50
No
752
Commented
1/4/2023 7:37:10 PM
If you want to just scrape words, images and the formatting on a web page, you can use [trilium notes](https://github.com/zadam/trilium) along with their web clipper [browser plugin](https://github.com/zadam/trilium/wiki/Web-clipper).
With the web clipper plugin you can copy the whole page as it is, images and all, to your local Trilium instance.
j2y52t1
selfhosted
codecarter
t1_j2y52t1
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2y52t1/
1/4/2023 7:37:10 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
102gfg4
t3_102gfg4
102gfg4
1
102gfg4
False
False
False
0
1
8
8
0
0
0
0
0
0
31
50
62
128, 128, 128
3
Solid
50
No
751
RepliedTo
1/4/2023 7:38:16 PM
Available as a standalone exe for Windows as well. Works just the same as the Linux self-hosted instance.
j2y599f
selfhosted
codecarter
t1_j2y599f
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2y599f/
1/4/2023 7:38:16 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j2y52t1
t1_j2y52t1
j2y52t1
0
102gfg4
False
False
False
1
1
8
8
3
15.7894736842105
0
0
0
0
8
42.1052631578947
19
128, 128, 128
3
Solid
50
Yes
750
Commented
1/4/2023 6:07:01 AM
Look at this YouTube channel: https://youtube.com/@JohnWatsonRooney
Loads of info about web scraping.
j2vk4n4
selfhosted
cellerich
t1_j2vk4n4
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2vk4n4/
1/4/2023 6:07:01 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
102gfg4
t3_102gfg4
102gfg4
1
102gfg4
False
False
False
0
1
8
8
0
0
0
0
0
0
6
60
10
128, 128, 128
3
Solid
50
Yes
749
RepliedTo
1/4/2023 6:17:28 AM
That is really interesting. Thank you very much.
j2vl3wn
selfhosted
Sinclairxer
t1_j2vl3wn
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2vl3wn/
1/4/2023 6:17:28 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j2vk4n4
t1_j2vk4n4
j2vk4n4
0
102gfg4
True
False
False
1
1
8
8
2
25
0
0
0
0
3
37.5
8
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
748
Commented
1/3/2023 8:00:10 PM
Python (programming language) + requests (HTTP client) + Selenium (JavaScript runner, if needed) + bs4 (HTML parser) + pyexcel (Python Excel wrapper)
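The stack named here boils down to: download the page, parse the HTML, pull out the fields you want. A minimal stand-in sketch using only the standard library (`html.parser` in place of bs4; in practice you'd swap in requests/Selenium for the download step and bs4 for parsing) that collects the text of every `<h2>` on a page:

```python
# Stdlib-only sketch of "download then parse": collect <h2> headings.
# In the stack above, requests would fetch the HTML and bs4 would parse it.
from html.parser import HTMLParser

class TitleParser(HTMLParser):
    """Collect the text inside every <h2> tag."""
    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        if self.in_h2 and data.strip():
            self.titles.append(data.strip())

parser = TitleParser()
# Stand-in for a downloaded page body.
parser.feed("<html><body><h2>First</h2><p>x</p><h2>Second</h2></body></html>")
```

After `feed()`, `parser.titles` holds the headings in document order; from there, pyexcel (or the csv module) can write them out.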
j2t6svm
selfhosted
Square_Lawfulness_33
t1_j2t6svm
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2t6svm/
1/3/2023 8:00:10 PM
1/1/0001 12:00:00 AM
False
False
8
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
102gfg4
t3_102gfg4
102gfg4
1
102gfg4
False
False
False
0
2
8
8
1
5.26315789473684
0
0
0
0
16
84.2105263157895
19
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
747
RepliedTo
1/3/2023 9:01:07 PM
It is a web page with apartments; I want to watch prices and get notifications when a new apartment arrives.
j2tgxus
selfhosted
Sinclairxer
t1_j2tgxus
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2tgxus/
1/3/2023 9:01:07 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j2tfyof
t1_j2tfyof
j2tfyof
1
102gfg4
True
False
False
3
4
8
8
0
0
0
0
0
0
7
36.8421052631579
19
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
746
RepliedTo
1/3/2023 9:03:24 PM
Does [zillow](https://www.zillow.com/) not do that already?
j2thbem
selfhosted
Square_Lawfulness_33
t1_j2thbem
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2thbem/
1/3/2023 9:03:24 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j2tgxus
t1_j2tgxus
j2tgxus
1
102gfg4
False
False
False
4
2
8
8
0
0
0
0
0
0
3
30
10
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
745
RepliedTo
1/3/2023 9:05:59 PM
Something similar.
j2thqy5
selfhosted
Sinclairxer
t1_j2thqy5
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2thqy5/
1/3/2023 9:05:59 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j2thbem
t1_j2thbem
j2thbem
0
102gfg4
True
False
False
5
4
8
8
0
0
0
0
0
0
2
100
2
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
744
Commented
1/3/2023 7:45:14 PM
Don't know of a specific scraper but I use Node-RED to do Web scraping in general pretty successfully.
You can write out to CSV pretty easily.
Not exactly what you were asking for but thought it was worth a mention.
j2t4buu
selfhosted
cbarker151
t1_j2t4buu
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2t4buu/
1/3/2023 7:45:14 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
102gfg4
t3_102gfg4
102gfg4
1
102gfg4
False
False
False
0
2
8
8
4
9.75609756097561
0
0
0
0
16
39.0243902439024
41
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
743
RepliedTo
1/3/2023 7:53:59 PM
Is it also good with pagination?
j2t5s88
selfhosted
Sinclairxer
t1_j2t5s88
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2t5s88/
1/3/2023 7:53:59 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j2t4buu
t1_j2t4buu
j2t4buu
1
102gfg4
True
False
False
1
4
8
8
1
16.6666666666667
0
0
0
0
1
16.6666666666667
6
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
742
RepliedTo
1/3/2023 9:46:31 PM
Personally it's not something I have needed to do, so I'm not sure. But I know there is a Selenium plugin for Node-RED, and I'm pretty sure Selenium can handle pagination OK.
What about Scrapy in Python?
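Whatever tool ends up doing the fetching, the pagination pattern itself is the same: follow the "next page" link until there isn't one. A tool-agnostic sketch, where `fetch` is a placeholder for whatever actually downloads a page (requests, Selenium, a Scrapy callback, ...):

```python
# Generic pagination loop: accumulate items across pages until no
# "next" link remains. fetch(url) -> (items_on_page, next_url_or_None).
def crawl_all_pages(start_url, fetch):
    items, url = [], start_url
    while url is not None:
        page_items, url = fetch(url)
        items.extend(page_items)
    return items
```

In Scrapy the same idea is usually written as `yield response.follow(next_page, callback=self.parse)` inside the spider's `parse` method.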
j2tohia
selfhosted
cbarker151
t1_j2tohia
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2tohia/
1/3/2023 9:46:31 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j2t5s88
t1_j2t5s88
j2t5s88
1
102gfg4
False
False
False
2
2
8
8
1
2.85714285714286
0
0
0
0
16
45.7142857142857
35
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
741
RepliedTo
1/4/2023 1:38:20 PM
I will check it out.
Thank you very much.
j2wmj0l
selfhosted
Sinclairxer
t1_j2wmj0l
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2wmj0l/
1/4/2023 1:38:20 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j2tohia
t1_j2tohia
j2tohia
0
102gfg4
True
False
False
3
4
8
8
1
11.1111111111111
0
0
0
0
3
33.3333333333333
9
128, 128, 128
3
Solid
50
Yes
740
Commented
1/3/2023 7:20:49 PM
You don't say what features are important or what about changedetection.io didn't work but maybe [ArchiveBox](https://archivebox.io/) or [Huginn](https://github.com/huginn/huginn#readme)
j2t0dbz
selfhosted
carrythen0thing
t1_j2t0dbz
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2t0dbz/
1/3/2023 7:20:49 PM
1/1/0001 12:00:00 AM
False
False
9
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
102gfg4
t3_102gfg4
102gfg4
1
102gfg4
False
False
False
0
1
8
8
2
7.14285714285714
0
0
0
0
12
42.8571428571429
28
128, 128, 128
3
Solid
50
Yes
739
RepliedTo
1/3/2023 7:28:00 PM
Yes you are right.
Change detection is for checking changes. I need something that will scrape a webpage and make an Excel file of the data, or something similar. I also tried n8n, but I didn't like it.
j2t1jcp
selfhosted
Sinclairxer
t1_j2t1jcp
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/j2t1jcp/
1/3/2023 7:28:00 PM
1/3/2023 7:32:33 PM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j2t0dbz
t1_j2t0dbz
j2t0dbz
0
102gfg4
True
False
False
1
1
8
8
1
2.77777777777778
0
0
0
0
16
44.4444444444444
36
128, 128, 128
3
Solid
50
Yes
731
Commented
11/1/2020 5:34:35 PM
Oh my god, this is a sibling problem, a hard one.
Do you have to specifically use XPath? Because I can help you with a CSS selector option...
In that case: first, find the index in this "list" of properties where the value you need is placed (on the page from your example it's the second), knowing they are 'strong' tags. So you just get the text from those tags and count until you get the match. Once you've got that, you go get the td at that index. With CSS selectors you can use nth-child(n) to get the child at position n.
There is a CSS technique for finding siblings: the '+' combinator. If you search for 'td + td' in the inspector (where it says 'Search HTML'), you'll find all the tds that are adjacent siblings of another td.
Hope this helps.
Edit: I recommend this site for practicing CSS selectors; it's really helpful: [Css Diner](https://flukeout.github.io/)
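The counting approach described here (find which label matches, then take the value cell at the same index) can be sketched with the standard library alone. The table below is a made-up stand-in for the page in question, and it must be well-formed for `ElementTree` to parse it; a real page would go through bs4 or lxml instead:

```python
# Label/value pairing by index: find the position of the matching
# <strong> label, then take the <td> at that position in the value row.
import xml.etree.ElementTree as ET

html = """<table><tr>
  <td><strong>Type</strong></td><td><strong>Price</strong></td>
</tr><tr>
  <td>Flower</td><td>$35</td>
</tr></table>"""

root = ET.fromstring(html)
labels = [s.text for s in root.iter("strong")]          # ['Type', 'Price']
values = [td.text for td in root.findall("./tr[2]/td")]  # ['Flower', '$35']
price = values[labels.index("Price")]
```

The CSS-selector equivalent in bs4 would be `soup.select("td:nth-child(2)")` once the index is known, or `soup.select("td + td")` to grab adjacent-sibling cells directly.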
gat68oo
learnprogramming
town_girl
t1_gat68oo
https://www.reddit.com/r/learnprogramming/comments/jm4oc1/help_with_xpath/gat68oo/
11/1/2020 5:34:35 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jm4oc1
t3_jm4oc1
jm4oc1
1
jm4oc1
False
False
False
0
1
2
2
2
1.12359550561798
2
1.12359550561798
0
0
74
41.5730337078652
178
128, 128, 128
3
Solid
50
Yes
730
RepliedTo
11/2/2020 3:40:59 AM
Thanks for the reply. As I said, I don't have much knowledge of programming, and Octoparse is a simple point-and-grab solution which uses XPath to locate elements, so that's the reason I am interested in XPath. Though I will check the links and hopefully grab some knowledge of CSS as well.
gav3xjy
learnprogramming
kartikoli
t1_gav3xjy
https://www.reddit.com/r/learnprogramming/comments/jm4oc1/help_with_xpath/gav3xjy/
11/2/2020 3:40:59 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gat68oo
t1_gat68oo
gat68oo
0
jm4oc1
True
False
False
1
1
2
2
1
1.96078431372549
0
0
0
0
23
45.0980392156863
51
128, 128, 128
3
Solid
50
No
727
Posted
7/4/2022 7:02:20 AM
Hello everyone. I am relatively new to web scraping and there is a lot I need to learn. I want to ask about Octoparse, which I am using nowadays. Would you mind telling me the approaches you use to avoid blockage from sites? Especially, how do you create (or where do you get) a free IP list for IP rotation?
Please answer considering that I am using Octoparse and don't yet have sufficient expertise in scraping libraries (Selenium, BeautifulSoup, etc.).
vr1hpj
webscraping
rd_md005
t3_vr1hpj
https://www.reddit.com/r/webscraping/comments/vr1hpj/octoparse_scraping/
7/4/2022 7:02:20 AM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse scraping
False
0.67
vr1hpj
0
1
38
38
2
2.5
1
1.25
0
0
35
43.75
80
128, 128, 128
3
Solid
50
No
726
Commented
7/4/2022 11:46:04 AM
Octoparse is cool, but some websites have anti-scraping blocking in place. For example, I can scrape some of my Nigerian e-commerce sites, but websites like [https://www.supermart.ng/sub-category/new-additions/new-additions](https://www.supermart.ng/sub-category/new-additions/new-additions) and [https://shop.manoapp.com/en/categories/245-snacks](https://shop.manoapp.com/en/categories/245-snacks) definitely can't be scraped by Octoparse or Python. I have tried all I could, but to no avail.
iet39fh
webscraping
IamFromNigeria
t1_iet39fh
https://www.reddit.com/r/webscraping/comments/vr1hpj/octoparse_scraping/iet39fh/
7/4/2022 11:46:04 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
vr1hpj
t3_vr1hpj
vr1hpj
1
vr1hpj
False
False
False
0
1
38
38
1
1.17647058823529
2
2.35294117647059
0
0
39
45.8823529411765
85
128, 128, 128
3
Solid
50
No
725
RepliedTo
7/4/2022 12:03:10 PM
Not even if you use IPs from a different location? Or if you have a 10-second IP rotation in place?
iet4teu
webscraping
alee001
t1_iet4teu
https://www.reddit.com/r/webscraping/comments/vr1hpj/octoparse_scraping/iet4teu/
7/4/2022 12:03:10 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iet39fh
t1_iet39fh
iet39fh
0
vr1hpj
False
False
False
1
1
38
38
0
0
0
0
0
0
9
42.8571428571429
21
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
723
Commented
1/17/2020 3:35:28 AM
https://developer.bestbuy.com/apis
felvq3m
api
dfish17
t1_felvq3m
https://www.reddit.com/r/api/comments/epm6um/figuring_out_a_method_for_pulling_data_from/felvq3m/
1/17/2020 3:35:28 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
epm6um
t3_epm6um
epm6um
1
epm6um
False
False
False
0
2
26
26
0
0
0
0
0
0
0
0
0
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
722
RepliedTo
1/17/2020 3:54:15 AM
I've already taken a look at that and I would have to apply for a key at the very least
felx9rs
api
Rigg_Enderslaye
t1_felx9rs
https://www.reddit.com/r/api/comments/epm6um/figuring_out_a_method_for_pulling_data_from/felx9rs/
1/17/2020 3:54:15 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
felvq3m
t1_felvq3m
felvq3m
1
epm6um
True
False
False
1
4
26
26
0
0
0
0
0
0
6
30
20
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
721
RepliedTo
1/17/2020 5:03:38 AM
What’s your aversion to registering for a key? Most companies will allow for a large or even unlimited amount of transactions. They want people directing traffic to their products.
fem2ase
api
dfish17
t1_fem2ase
https://www.reddit.com/r/api/comments/epm6um/figuring_out_a_method_for_pulling_data_from/fem2ase/
1/17/2020 5:03:38 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
felx9rs
t1_felx9rs
felx9rs
1
epm6um
False
False
False
2
2
26
26
1
3.33333333333333
1
3.33333333333333
0
0
11
36.6666666666667
30
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
720
RepliedTo
1/17/2020 5:22:14 AM
Well, I think the experience of building my own web crawler/API would be great, but I'm also not even sure how to request it in a correct manner. I'm trying to create a tool for Best Buy employees, not something that would directly generate business for Best Buy.
fem3i7e
api
Rigg_Enderslaye
t1_fem3i7e
https://www.reddit.com/r/api/comments/epm6um/figuring_out_a_method_for_pulling_data_from/fem3i7e/
1/17/2020 5:22:14 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fem2ase
t1_fem2ase
fem2ase
2
epm6um
True
False
False
3
4
26
26
5
10.4166666666667
0
0
0
0
18
37.5
48
128, 128, 128
3
Solid
50
No
719
RepliedTo
1/17/2020 12:54:07 PM
I just took a look and signed up. Their API is dead simple.
Plus they give you working examples for every method. They use cURL, but if you're not familiar then this website translates that to Python. https://curl.trillworks.com/
fempfaf
api
whattodo-whattodo
t1_fempfaf
https://www.reddit.com/r/api/comments/epm6um/figuring_out_a_method_for_pulling_data_from/fempfaf/
1/17/2020 12:54:07 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fem3i7e
t1_fem3i7e
fem3i7e
0
epm6um
False
False
False
4
1
26
26
0
0
1
2.7027027027027
0
0
16
43.2432432432432
37
128, 128, 128
3
Solid
50
No
724
Posted
1/16/2020 5:08:49 PM
I'm completely new to APIs and crawlers and I'm trying to figure out how to pull the SKU, Name, Sale Price, Normal Price, and the three GSP prices for all laptops on the site. I've tried Octoparse, but it keeps crashing on me after like 14 lines. I assume an API could function a little better, but I have no idea where to start.
epm6um
api
Rigg_Enderslaye
t3_epm6um
https://www.reddit.com/r/api/comments/epm6um/figuring_out_a_method_for_pulling_data_from/
1/16/2020 5:08:49 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Figuring out a method for pulling data from Bestbuycom
False
1
epm6um
0
1
26
26
1
1.5625
1
1.5625
0
0
28
43.75
64
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
718
RepliedTo
1/17/2020 5:31:50 AM
Most APIs require a key. It's good security. Read their docs to find out how to go about requesting the key. It really isn't a big hurdle.
fem44h0
api
turningsteel
t1_fem44h0
https://www.reddit.com/r/api/comments/epm6um/figuring_out_a_method_for_pulling_data_from/fem44h0/
1/17/2020 5:31:50 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fem3i7e
t1_fem3i7e
fem3i7e
1
epm6um
False
False
False
4
4
26
26
1
3.7037037037037
0
0
0
0
13
48.1481481481481
27
128, 128, 128
3
Solid
50
Yes
717
RepliedTo
1/17/2020 5:34:06 AM
I guess it's worth a shot. Where do I go from there to collect my data? I have zero experience with APIs.
fem49g0
api
Rigg_Enderslaye
t1_fem49g0
https://www.reddit.com/r/api/comments/epm6um/figuring_out_a_method_for_pulling_data_from/fem49g0/
1/17/2020 5:34:06 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fem44h0
t1_fem44h0
fem44h0
1
epm6um
True
False
False
5
1
26
26
1
4.54545454545455
0
0
0
0
8
36.3636363636364
22
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
716
RepliedTo
1/17/2020 6:04:23 AM
You have to make an HTTP request to their endpoint to return the data. Read the docs to see which URL returns the data you are looking for. Since you need data for laptops, I would start by checking their product API.
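As a sketch of that first step, here is how the request URL for a product query might be assembled. The host and parameter names (`apiKey`, `format`, `pageSize`) are recalled from Best Buy's public API documentation and should be treated as assumptions to confirm against the official reference:

```python
# Build a product-search URL for the API; fetch the result with
# urllib.request or requests and parse the JSON body.
# Parameter names are assumptions -- verify against the API docs.
from urllib.parse import urlencode

def product_search_url(api_key, query, page_size=10):
    """Return the full request URL for a product query string."""
    base = f"https://api.bestbuy.com/v1/products({query})"
    params = {"apiKey": api_key, "format": "json", "pageSize": page_size}
    return base + "?" + urlencode(params)
```

From there it's one GET request per page of results, reading fields like SKU and price out of the returned JSON.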
fem638z
api
turningsteel
t1_fem638z
https://www.reddit.com/r/api/comments/epm6um/figuring_out_a_method_for_pulling_data_from/fem638z/
1/17/2020 6:04:23 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fem49g0
t1_fem49g0
fem49g0
0
epm6um
False
False
False
6
4
26
26
0
0
0
0
0
0
19
44.1860465116279
43
128, 128, 128
3
Solid
50
No
715
Posted
6/14/2022 5:31:50 AM
Basically, I made a web scraper (I think?) in Octoparse. It successfully grabbed all the data I wanted from the site. But, the problem is there is over 200+ links from the data that I'd like to hyperlink to different words.
I scraped a dictionary site, and from the scraping it separated the word from the link, and I would like to put them back together for easier access.
​
(Ex: hyperlink: dictionary. com/apple to the word "apple", then continue to the next data entry hyperlink: dictionary. com/banana to the word "banana" and so forth)
I've seen a lot of tutorials for linking multiple links to **one** word but not different links to different words. (which should be relatively easy since I have all the original words & links, it's just hyperlinking them together would take too much time since there's SOO MANY!).
Any advice on specific programs (google sheets, etc) or codes that I can use to put the hyperlinks together for me would be great!
TIA!
​
P.S. I put this in the "programming" flair because I'm unsure of what flair would match best for this, so feel free to let me know if another flair better describes this problem.
vbw7av
techsupport
PastyBums
t3_vbw7av
https://www.reddit.com/r/techsupport/comments/vbw7av/a_programcode_to_create_different_hyperlinks_to/
6/14/2022 5:31:50 AM
6/14/2022 5:36:06 AM
False
False
2
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
A program/code to create different hyperlinks to different words
False
0.76
vbw7av
0
1
50
50
7
3.38164251207729
6
2.89855072463768
0
0
88
42.512077294686
207
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
714
Commented
6/14/2022 6:24:10 AM
If the word is part of the URL, why not write a small script that parses the URL and puts it together the way you want it?
ETA:
In Python, for example, without using any libraries:

    data = [
        "https://www.dictionary.com/apple",
        "https://www.dictionary.com/ball",
        "https://www.dictionary.com/cat",
        ...
        "https://www.dictionary.com/xylophone",
        "https://www.dictionary.com/yak",
        "https://www.dictionary.com/zebra"
    ]
    with open('dictionary.html', 'w') as file:
        for item in data:
            term = item.rsplit('/', 1)[1].title()
            file.write(f'<a href="{item}">{term}</a><br>')

This will give you a barebones HTML file with hyperlinks to all your terms.
icats5k
techsupport
htepO
t1_icats5k
https://www.reddit.com/r/techsupport/comments/vbw7av/a_programcode_to_create_different_hyperlinks_to/icats5k/
6/14/2022 6:24:10 AM
6/14/2022 9:26:57 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
vbw7av
t3_vbw7av
vbw7av
1
vbw7av
False
False
False
0
2
50
50
0
0
0
0
0
0
57
52.2935779816514
109
128, 128, 128
3
Solid
50
Yes
713
RepliedTo
6/14/2022 1:33:37 PM
I have no idea how to write scripts or how to parse URLs; it took me 5 hours just to make the scraper itself in Octoparse.
So just to clarify, the “term” in the code you mentioned here is the link, while the “item” in the code is the actual word itself I want to connect it to?
I really know nothing in regards to coding, so sorry if my question is stupid.
icbsg9z
techsupport
PastyBums
t1_icbsg9z
https://www.reddit.com/r/techsupport/comments/vbw7av/a_programcode_to_create_different_hyperlinks_to/icbsg9z/
6/14/2022 1:33:37 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
icats5k
t1_icats5k
icats5k
1
vbw7av
True
False
False
1
1
50
50
0
0
2
2.73972602739726
0
0
28
38.3561643835616
73
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
712
RepliedTo
6/14/2022 4:59:16 PM
It's the opposite, actually.
If you'd like a line-by-line breakdown of the script so you can try it yourself, I can do that.
If you'd like to upload the full list of URLs somewhere, I can write a script to generate an HTML file with clickable links to each word.
icclb6c
techsupport
htepO
t1_icclb6c
https://www.reddit.com/r/techsupport/comments/vbw7av/a_programcode_to_create_different_hyperlinks_to/icclb6c/
6/14/2022 4:59:16 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
icbsg9z
t1_icbsg9z
icbsg9z
0
vbw7av
False
False
False
2
2
50
50
0
0
1
1.92307692307692
0
0
21
40.3846153846154
52
128, 128, 128
3.09366130558184
Dash Dot Dot
49.5985944046493
No
711
Posted
6/7/2022 7:38:41 AM
Data is becoming more and more important for everyone. Would you like to learn more about web data extraction? Visit the website [here](https://www.octoparse.com/blog/web-data-extraction-2020), where you can read about: 1. What is web extraction? 2. The benefits of web extraction 3. How does web extraction work? and more...
https://preview.redd.it/hqykaf3hk5491.png?width=1600&format=png&auto=webp&v=enabled&s=e58dd15e02cfed7ecedcde99cee61390d9e56cff
v6pi3t
u_Octoparse_de
Octoparse_de
t3_v6pi3t
https://www.reddit.com/r/u_Octoparse_de/comments/v6pi3t/web_data_extraktion_the_definitive_guide_2022/
6/7/2022 7:38:41 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Web Data Extraction: The Definitive Guide 2022
False
1
v6pi3t
0
100
1
1
0
0
2
3.84615384615385
0
0
28
53.8461538461538
52
128, 128, 128
3.09366130558184
Dash Dot Dot
49.5985944046493
No
710
Posted
6/8/2022 6:19:43 AM
EXTRA 10% off everything, on June 15 ONLY!
【Standard annual plan】Save $271 + FREE crawler + 1-on-1 training
【Professional annual plan】Save $800 + 3 FREE crawlers + 3 1-on-1 trainings
Click here for more offers: [https://www.octoparse.de/summer-sale-2022](https://www.octoparse.de/summer-sale-2022?fbclid=IwAR32xHQeVKsiCP5ke3F4F-44aAM7oDVi_S8HODxpy4IrBj_5J3eLgunnWH0)
&#x200B;
[Octoparse Summer Sale](https://preview.redd.it/babqj3tabc491.png?width=800&format=png&auto=webp&v=enabled&s=764f263344ba985c23f69dfcf9c421817921baf0)
v7io1r
u_Octoparse_de
Octoparse_de
t3_v7io1r
https://www.reddit.com/r/u_Octoparse_de/comments/v7io1r/octoparse_webscraping_sommerverkauf/
6/8/2022 6:19:43 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse Web Scraping Summer Sale
False
1
v7io1r
0
100
1
1
0
0
0
0
0
0
50
64.1025641025641
78
128, 128, 128
3.09366130558184
Dash Dot Dot
49.5985944046493
No
709
Posted
6/15/2022 6:10:39 AM
Today only: EXTRA 10% off everything, plus extra gifts!
【Standard annual】Save $271 + FREE crawler + 1-on-1 training
【Professional annual】Save $800 + 3 FREE crawlers + 3 1-on-1 trainings
Click here: [https://www.octoparse.de/summer-sale-2022](https://www.octoparse.de/summer-sale-2022)
&#x200B;
https://preview.redd.it/zxh557o88q591.png?width=800&format=png&auto=webp&v=enabled&s=52df2cd0ef1735b00936de2900dcbb43dfe2e345
vcnryl
u_Octoparse_de
Octoparse_de
t3_vcnryl
https://www.reddit.com/r/u_Octoparse_de/comments/vcnryl/sommerverkauf_2022_startet_jetzt/
6/15/2022 6:10:39 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Summer Sale 2022 starts now!
False
1
vcnryl
0
100
1
1
0
0
0
0
0
0
31
62
50
128, 128, 128
3.09366130558184
Dash Dot Dot
49.5985944046493
No
707
Posted
6/15/2022 6:09:45 AM
Today only: EXTRA 10% off everything, plus extra gifts!
【Standard annual】Save $271 + FREE crawler + 1-on-1 training
【Professional annual】Save $800 + 3 FREE crawlers + 3 1-on-1 trainings
Click here: [https://www.octoparse.de/summer-sale-2022](https://www.octoparse.de/summer-sale-2022)
vcnrgp
u_Octoparse_de
Octoparse_de
t3_vcnrgp
https://www.reddit.com/r/u_Octoparse_de/comments/vcnrgp/sommerverkauf_2022_startet_jetzt/
6/15/2022 6:09:45 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Summer Sale 2022 starts now!
False
1
vcnrgp
0
100
1
1
0
0
0
0
0
0
30
62.5
48
128, 128, 128
3.09366130558184
Dash Dot Dot
49.5985944046493
No
705
Posted
2/1/2023 2:26:08 AM
**Article source:** [https://octoparse.de/blog/3-methoden-zum-scrapen-der-daten-aus-einer-tabelle](https://octoparse.de/blog/3-methoden-zum-scrapen-der-daten-aus-einer-tabelle)
Many websites present their data in tables. Saving that table data to a local computer, however, can be a difficult task: the data is embedded in HTML and cannot be downloaded in a structured format such as CSV. In this case, web scraping is the easiest way to get the data.
Here I would like to introduce 3 methods you can use to scrape data from a table quickly and easily.
# Octoparse
Octoparse is a powerful web scraping tool that lets you extract data at scale in a short time. It is easy to use: by drag and drop you can build a workflow that fetches the information you need from any website.
https://preview.redd.it/owc7nesmmhfa1.png?width=800&format=png&auto=webp&v=enabled&s=42dbc5eb056420564d5c01e8a6ce410102bd7060
**The steps for scraping data with Octoparse are as follows.**
**✅ Step 1:** Click “Advanced Mode” to start a new project.
**✅ Step 2:** Enter the target URL into the box and click “Save URL” to open the website in Octoparse's built-in browser.
**✅ Step 3:** Set up pagination with 3 clicks:
a) Click “B” in the browser
b) Click “Select all” in “Action Tips”.
c) Click “Loop click each URL” in “Action Tips”.
Now we can see that a “Pagination Loop” has been created in the workflow box.
**✅ Step 4:** Configure the task
a) Click the first cell in the first row of the table
b) Click the expand icon in “Action Tips” until the entire row is highlighted in green (the tag should usually be TR).
c) Click “Select all sub-elements” in “Action Tips”, then “Extract data” and “Extract data in the loop”.
The loop for scraping the table is now built into the workflow.
**✅ Step 5:** Extract and export the data
With the 5 steps above, we get the following result.
## Get Octoparse here! 🤩
**Price:** $0\~$249 per month
**Plans & pricing:** [Octoparse premium pricing & plans](https://www.octoparse.de/pricing)
**Free trial:**[ 14-day free trial](https://www.octoparse.de/signup)
**Download:** [Octoparse for Windows and macOS](https://www.octoparse.de/download/windows)
# Google Sheets
Google Sheets has a function called [***ImportHtml***](https://support.google.com/docs/answer/3093339?hl=en) that extracts data from a table within an HTML page with the fixed expression =ImportHtml(URL, “table”, num).
**✅ Step 1:** Open a new Google Sheet and enter the expression into an empty cell. A short introduction to the formula is displayed.
**✅ Step 2:** Enter the URL (example:[ https://en.wikipedia.org/wiki/Forbes%27\_list\_of\_the\_world%27s\_highest-paid\_athletes](https://en.wikipedia.org/wiki/Forbes%27_list_of_the_world%27s_highest-paid_athletes)) and adjust the index field as needed.
With the 2 steps above, we can scrape the data from a table with Google Sheets within minutes. There is an obvious limitation, though: we would have to repeat the process over and over to scrape tables from multiple pages with Google Sheets, so a more efficient method is needed to automate the process.
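Scraping a table without the spreadsheet round-trip can also be scripted; a minimal stdlib-only Python sketch that pulls every cell out of an HTML table (the inline HTML string here is a stand-in for a fetched page, not code from the article):

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by table row."""
    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "tr" and self._row:
            self.rows.append(self._row)
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

# Stand-in for a page fetched with urllib.request.urlopen(url).read().decode()
html = ("<table><tr><th>Rank</th><th>Athlete</th></tr>"
        "<tr><td>1</td><td>Messi</td></tr></table>")
parser = TableParser()
parser.feed(html)
print(parser.rows)  # → [['Rank', 'Athlete'], ['1', 'Messi']]
```

Run once per URL in a loop and the multi-page repetition disappears.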
# The R language (with the rvest package)
In this case I will also use this website (https://de.investing.com/currencies/single-currency-crosses) as an example to show how to scrape tables with rvest.
Before we start writing code, we need to know some basic grammar of the rvest package.
html_nodes(): selects a particular part of a document. We can use CSS selectors, like html_nodes(doc, "table td"), or XPath selectors, html_nodes(doc, xpath = "//table//td")
html_tag(): extracts the tag name. Similar functions are html_text(), html_attr() and html_attrs()
html_table(): parses HTML tables and extracts them into R data frames.
Beyond that, there are functions for simulating human browsing behaviour, for example html_session(), jump_to(), follow_link(), back(), forward(), submit_form() and so on.
In this case we need html\_table() to reach our goal of reading data out of a table.
First, download R ([https://cran.r-project.org/](https://cran.r-project.org/)).
**✅ Step 1:** Install rvest.
https://preview.redd.it/ywykufq5nhfa1.png?width=558&format=png&auto=webp&v=enabled&s=37d8989dcaf70edfc67fa1f572ae59be6802bd80
**✅ Step 2:** Write the script. The code shown in the original article's figure boils down to roughly:

    library(rvest)     # import the rvest package
    library(magrittr)  # import the magrittr package (the %>% pipe)
    url <- "https://de.investing.com/currencies/single-currency-crosses"  # the target URL
    page <- read_html(url)                      # access the target URL
    tables <- page %>% html_table(fill = TRUE)  # read the data out of the tables

**✅ Step 3:** After writing the whole script into the R console, press “Enter” to run it. Now we get the table information immediately.
For people without any programming background, coding comes with a steep learning curve that raises the barrier to entry into web scraping, and makes it harder for them to gain a competitive edge from web data.
I hope the tutorial above helps you get a general idea of how a web scraping tool can help you achieve the same result as a programmer, effortlessly.
If you run into problems with data extraction, or would like to give us suggestions, please contact us by e-mail ([**support@octoparse.com**](mailto:support@octoparse.com)). 💬
Author: the Octoparse team ❤️
[https://dataservice.octoparse.com/de/web-scraping-templates](https://preview.redd.it/492ce36anhfa1.png?width=800&format=png&auto=webp&v=enabled&s=e9a278abeb57c265b35d83b6fe05a9b013040ae4)
10qi03j
u_Octoparse_de
Octoparse_de
t3_10qi03j
https://www.reddit.com/r/u_Octoparse_de/comments/10qi03j/3_methoden_zum_scrapen_der_daten_aus_einer_tabelle/
2/1/2023 2:26:08 AM
2/1/2023 2:29:12 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
3 Methods for Scraping Data from a Table
False
1
10qi03j
0
100
1
1
4
0.444938820912125
30
3.33704115684093
0
0
483
53.726362625139
899
128, 128, 128
3.09366130558184
Dash Dot Dot
49.5985944046493
No
704
Posted
6/17/2022 1:17:46 AM
👏 Summer Sale 2022 still going
30% OFF when you renew or upgrade
【Standard annual】Save $201!
【Professional annual】Save $500!
👉 Get free crawlers & 1-on-1 training: https://www.octoparse.de/summer-sale-2022
https://preview.redd.it/9rzwkxkt13691.png?width=800&format=png&auto=webp&v=enabled&s=b4c0ce5fc152b0244eb0af70690e891bc6374f22
ve12mg
u_Octoparse_de
Octoparse_de
t3_ve12mg
https://www.reddit.com/r/u_Octoparse_de/comments/ve12mg/summer_sale_2022/
6/17/2022 1:17:46 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Summer Sale 2022!!!
False
1
ve12mg
0
100
1
1
0
0
0
0
0
0
19
73.0769230769231
26
128, 128, 128
3.09366130558184
Dash Dot Dot
49.5985944046493
No
703
Posted
6/28/2022 1:40:07 AM
[\#SummerSale](https://twitter.com/hashtag/SummerSale?src=hashtag_click) [\#Sommer](https://twitter.com/hashtag/Sommer?src=hashtag_click)
⌛ Ends today!
The sale ends TONIGHT at 11:59 pm EST.
✨ 30% off annual plans
✨ Get FREE custom crawlers & training
Save now and set off on new data scraping journeys.
[https://www.octoparse.de/summer-sale-2022](https://www.octoparse.de/summer-sale-2022)
https://preview.redd.it/p7wmyi3un9891.png?width=800&format=png&auto=webp&v=enabled&s=697e35156b06447b946b06f72b4997d197c95faa
vmb7tp
u_Octoparse_de
Octoparse_de
t3_vmb7tp
https://www.reddit.com/r/u_Octoparse_de/comments/vmb7tp/summer_big_sale_for_web_scraping/
6/28/2022 1:40:07 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Summer Big Sale for Web Scraping!
False
1
vmb7tp
0
100
1
1
0
0
0
0
0
0
44
72.1311475409836
61
128, 128, 128
3.09366130558184
Dash Dot Dot
49.5985944046493
No
702
Posted
2/1/2023 2:43:25 AM
**Article source:** [RegEx: Extracting all phone numbers from strings | Octoparse Germany](https://octoparse.de/blog/regex-extrahieren-aller-telefonnummern-aus-zeichenketten)
Sometimes a regex tool can help you retrieve exactly the characters you need from a string. In this article we give you the basics of RegEx and how to use it.
Let's get started!
# What is RegEx?
In computer science, a regular expression (abbreviated RegEx or RegExp) is a character string that describes a set of strings by means of certain syntax rules. Regular expressions are used above all in software development and web design. RegEx matters, for example, in applications that expect user input, as is the case with online forms.
> *"A regular expression (RegExp or RegEx for short) is, in theoretical computer science, a character string that serves to describe sets of strings by means of certain syntactic rules."*
**How can we use a regular expression to read phone numbers out of strings?**
In some cases, phone numbers appear in a string together with other information. If you only want the phone numbers, you should write a RegExp that fetches all of them in a single pass, instead of typing "Ctrl + F", "Ctrl + C" and "Ctrl + V" for every entry.
With RegEx you can retrieve data that shares a common shape, such as numbers, names or dates whose formats do not differ, very quickly and easily.
# The basic RegEx rules
If you would like to extract phone numbers with RegEx expressions but don't know how to write such an expression, this article can help.
Learning RegEx from scratch takes time. But if you use RegEx frequently in your daily work and it can boost your productivity considerably, the effort is worth it.
&#x200B;
| **Character**| **Explanation**|
|:-|:-|
| \[abc\]| The square brackets \[ and \] define a character set. The example matches any one of these characters.|
| \[a-e\]| A hyphen defines a range. The example matches the characters a, b, c, d and e; again, only one of them has to match.|
| \[a-zA-Z0-9\]|A character set may also contain several groups and single characters. In the example, it matches the lowercase letters a to z, the uppercase letters A to Z, and the digits 0 to 9. |
| \[0-9\]|The hyphen can also be applied to digits alone. The example stands for the digits 0 through 9. |
| \[\^a\] |A \^ at the start of a character set negates it: it matches every character except the one after the \^. |
| \^a |Outside a bracket, this character stands for the beginning of a text.|
| a$ |This character stands for the end of a line or string.|
| . |The dot stands for any character and can therefore match everything.|
| a\* |The character before the star may occur any number of times.|
| .\* |Dot and star combined match any number of arbitrary characters.|
| a+ |The character before the + must occur at least once.|
| ab{2} |The preceding letter must be found exactly 2 times.|
| ab? |The question mark means the character may occur, but does not have to. |
| (a\|A) |The pipe \| acts as OR. Only one of the two characters (or strings) may occur. |
| $1 |A back-reference to a group or sub-pattern, important mainly for search and replace. $1 refers to the first parenthesized group. |
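The rules in the table above carry over directly to most regex engines; here are a few of them as runnable checks in Python's re module (the sample strings are invented for illustration):

```python
import re

# A few rules from the table above, checked against invented sample strings.
assert re.findall(r"[a-e]", "abcxyz") == ["a", "b", "c"]   # character set
assert re.search(r"^Start", "Start of text")                # ^ anchors the beginning
assert re.search(r"end$", "the end")                        # $ anchors the end
assert re.fullmatch(r"ab{2}", "abb")                        # {2}: exactly two b's
assert re.fullmatch(r"a|A", "A")                            # | acts as OR
assert re.fullmatch(r"colou?r", "color")                    # ?: optional character
# $1-style back-references are written \1 / \g<1> in Python's re.sub:
assert re.sub(r"(\d+)-(\d+)", r"\2-\1", "123-456") == "456-123"
print("all patterns behaved as described")
```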
Since these ground rules are fairly complicated for beginners, this article also offers a simpler method. If you want the benefits of RegEx the easy way, a RegEx tool is just the thing for you.
# The Octoparse RegEx tool
There are some ready-made tools that simplify writing RegEx. [Octoparse ](https://www.octoparse.de/)has a built-in RegEx tool for this job.
https://preview.redd.it/7entia9cphfa1.png?width=800&format=png&auto=webp&v=enabled&s=da467ef55577239975f7ccf0985d7269bf926c75
With this easy-to-use tool, all you have to worry about is finding the pattern the phone numbers follow in the text.
# Get Octoparse here! 🤩
**Price:** $0\~$249 per month
**Plans & pricing:** [Octoparse premium pricing & plans](https://www.octoparse.de/pricing)
**Free trial:**[ 14-day free trial](https://www.octoparse.de/signup)
**Download:** [Octoparse for Windows and macOS](https://www.octoparse.de/download/windows)
# Examples of scraping phone numbers with RegEx
A single large string may contain several phone numbers, and those phone numbers can come in different formats. Here is an example of the data format:
(021)1234567
(123) 456 7899
(123).456.7899
(123)-456-7899
123-456-7899
123 456 7899
1234567899
0511-4405222
021-87888822
+8613012345678
What is the easiest method to extract phone numbers like these? Now let's use the tool to generate regular expressions and find all the phone numbers quickly.
First, find the common characters each phone number starts and ends with. For the target text above, for example, I find the source code shown below.
<p>Here is an example of the data format </p>
<ul>
<li>(021)1234567 </li>
<li>(123) 456 7899 </li>
<li>(123).456.7899 </li>
<li>(123)-456-7899 </li>
<li>123-456-7899 </li>
<li>123 456 7899 </li>
<li>1234567899 </li>
<li>0511-4405222 </li>
<li>021-87888822 </li>
<li>+8613012345678 </li>
<li>... </li>
</ul>
**Every phone number starts with <li> and ends with </li>.**
And we can use the RegEx tool in Octoparse to quickly extract all the phone numbers.
✅ Start Octoparse and open the RegEx tool.
✅ Copy the source code and paste it into the **"Original Text"** box.
✅ Then choose the **"Start with"** option and enter **"<li>"**.
✅ Then choose the **"End with"** option and enter **"</li>"**.
✅ Don't forget to select the **"Match All"** option.
✅ Click **"Match"**.
https://preview.redd.it/jkfrph0mphfa1.png?width=800&format=png&auto=webp&v=enabled&s=36de77db697d9c0365930e75c00ee1051f5f6060
https://preview.redd.it/g17xdqtmphfa1.png?width=450&format=png&auto=webp&v=enabled&s=99e322a048ddc1d86dd37b87f1a8861ec671b375
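The tool's “start with <li>, end with </li>, match all” setting corresponds to a single non-greedy pattern in most languages; a minimal Python sketch over a few of the sample numbers:

```python
import re

# Source snippet from the article; the full list would follow the same shape.
source = """
<ul>
<li>(021)1234567 </li>
<li>(123) 456 7899 </li>
<li>0511-4405222 </li>
<li>+8613012345678 </li>
</ul>
"""

# Non-greedy capture of everything between <li> and </li>, i.e. "Match All":
phones = [m.strip() for m in re.findall(r"<li>(.*?)</li>", source)]
print(phones)  # → ['(021)1234567', '(123) 456 7899', '0511-4405222', '+8613012345678']
```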
If you run into problems with data extraction, or would like to give us suggestions, please contact us by e-mail ([**support@octoparse.com**](mailto:support@octoparse.com)). 💬
Author: the Octoparse team ❤️
[https://dataservice.octoparse.com/de/web-scraping-templates](https://preview.redd.it/rvsjfpbpphfa1.png?width=1056&format=png&auto=webp&v=enabled&s=5f1ec2b18d5d55d90954c005eb30b93eaf2b9d7a)
10qidf6
u_Octoparse_de
Octoparse_de
t3_10qidf6
https://www.reddit.com/r/u_Octoparse_de/comments/10qidf6/regex_extrahieren_aller_telefonnummern_aus/
2/1/2023 2:43:25 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
RegEx: Extracting All Phone Numbers from Strings
False
1
10qidf6
0
100
1
1
2
0.187793427230047
28
2.62910798122066
0
0
610
57.2769953051643
1065
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
701
Posted
4/29/2023 11:10:02 AM
[removed]
132q5kh
excel
Individual_Side313
t3_132q5kh
https://www.reddit.com/r/excel/comments/132q5kh/how_to_use_octoparse_to_extract_data_to_excel/
4/29/2023 11:10:02 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to use octoparse to extract data to excel including details of each data from a powerbi on a website
False
1
132q5kh
0
4
9
9
0
0
0
0
0
0
1
100
1
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
700
Posted
4/29/2023 11:11:44 AM
How to use octoparse to extract all the data including details of each data into excel from a powerbi on a website
Website: https://textileexchange.org/find-certified-company/
132q6r7
excel
Individual_Side313
t3_132q6r7
https://www.reddit.com/r/excel/comments/132q6r7/extract_data_into_excel/
4/29/2023 11:11:44 AM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Extract data into excel
False
0.67
132q6r7
0
4
9
9
1
4.34782608695652
0
0
0
0
11
47.8260869565217
23
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
699
Commented
4/29/2023 11:10:03 AM
This post has been removed due to **[Rule 2](/r/excel/wiki/sharingquestions#wiki_submission_rules) - Poor Post Body.**
Please post with a proper description in the body of your post.
The body of your post should be a detailed description of your problem. Providing [samples of your data](/r/excel/wiki/sharingquestions#wiki_posting_your_data) is always a good idea as well.
Putting your whole question in the title, and then saying the title says it all is not a sufficient post.
Links to your file, screenshots and/or video of the problem should be done to help illustrate your question. Those things **should not be your question**.
Here's a [long example](/r/excel/comments/6jg5hz/i_have_two_columns_which_specify_the_start_cell/?st=j5z4ms4n&sh=c7e9fe1d) and a [short example](/r/excel/comments/6rnz8l/conditional_formatting_excluding_empty_cells/?st=j5z4a7k1&sh=a6119df2) of good posts.
Rules are enforced to promote high quality posts for the community and to ensure questions can be easily navigated and referenced for future use. See the [Posting Guidelines](/r/excel/wiki/sharingquestions) for more details, and tips on how to make great posts.
*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/excel) if you have any questions or concerns.*
ji65pl2
excel
AutoModerator
t1_ji65pl2
https://www.reddit.com/r/excel/comments/132q5kh/how_to_use_octoparse_to_extract_data_to_excel/ji65pl2/
4/29/2023 11:10:03 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
132q5kh
t3_132q5kh
132q5kh
0
132q5kh
False
False
True
0
4
9
9
12
5.8252427184466
4
1.94174757281553
0
0
90
43.6893203883495
206
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
698
Commented
4/29/2023 11:11:44 AM
/u/Individual_Side313 - Your post was submitted successfully.
* Once your problem is solved, reply to the **answer(s)** saying `Solution Verified` to close the thread.
* Follow the **[submission rules](/r/excel/wiki/sharingquestions)** -- particularly 1 and 2. To fix the body, click edit. To fix your title, delete and re-post.
* Include your **[Excel version and all other relevant information](/r/excel/wiki/sharingquestions#wiki_give_all_relevant_information)**
Failing to follow these steps may result in your post being removed without warning.
*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/excel) if you have any questions or concerns.*
ji65ufb
excel
AutoModerator
t1_ji65ufb
https://www.reddit.com/r/excel/comments/132q6r7/extract_data_into_excel/ji65ufb/
4/29/2023 11:11:44 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
132q6r7
t3_132q6r7
132q6r7
0
132q6r7
False
False
True
0
4
9
9
5
4.67289719626168
4
3.73831775700935
0
0
49
45.7943925233645
107
128, 128, 128
3
Solid
50
No
676
Posted
12/1/2021 9:33:21 AM
Web scraping is the process of collecting structured web data in an automated fashion. It’s also called web data extraction. Some of the main use cases of web scraping include price monitoring, price intelligence, news monitoring, lead generation, and market research among many others.
In general, web data extraction is used by people and businesses who want to make use of the vast amount of publicly available web data to make smarter decisions. After trying many scraping tools, I find these 5 work best for any web data extraction.
* [ScraperAPI](https://www.scraperapi.com/?fp_ref=polasr7)
* [Octoparse](https://www.octoparse.com/)
* [Scrapingbee](https://www.scrapingbee.com/)
* Zyte
* Parsehub
Happy Web scraping.
r6azor
u_ScrapperExpert
ScrapperExpert
t3_r6azor
https://www.reddit.com/r/u_ScrapperExpert/comments/r6azor/what_are_the_best_5_web_scrapping_apitool_to/
12/1/2021 9:33:21 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What are the best 5 Web Scraping APIs/Tools to scrape data?
False
1
r6azor
0
1
51
51
7
6.19469026548673
0
0
0
0
65
57.5221238938053
113
128, 128, 128
3
Solid
50
No
675
Commented
12/7/2021 6:54:58 AM
[Wersel Data-Hub](https://wersel.io/) is an enterprise-focused [web scraping tool](https://wersel.io/). With its integrated tools, you can create your own web scraping agents. It has a lot of flexibility when it comes to dealing with complicated websites and data extraction.
**Visit their official site to book a free demo -** [**https://wersel.io/**](https://www.wersel.io)
hnkeiry
u_ScrapperExpert
Helenawilliam92
t1_hnkeiry
https://www.reddit.com/r/u_ScrapperExpert/comments/r6azor/what_are_the_best_5_web_scrapping_apitool_to/hnkeiry/
12/7/2021 6:54:58 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
r6azor
t3_r6azor
r6azor
0
r6azor
False
False
False
0
1
51
51
3
4.91803278688525
1
1.63934426229508
0
0
32
52.4590163934426
61
128, 128, 128
3
Solid
50
No
672
Commented
4/22/2020 4:44:42 AM
If you continue to have trouble with Octoparse, check out our (free) Octoparse alternative here: [https://rsx.inferlink.com/](https://rsx.inferlink.com/#/)
fo5o5n1
webscraping
eghantous
t1_fo5o5n1
https://www.reddit.com/r/webscraping/comments/g465al/is_octoparse_down/fo5o5n1/
4/22/2020 4:44:42 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
g465al
t3_g465al
g465al
0
g465al
False
False
False
0
1
6
6
1
4.34782608695652
1
4.34782608695652
0
0
9
39.1304347826087
23
128, 128, 128
3
Solid
50
No
668
Commented
4/22/2020 4:43:49 AM
I think it depends on what you're looking for and how much time you have on your hands. Those tools you listed have different price points - do you have a budget in mind? If you're looking for a free tool, check out ours: [https://rsx.inferlink.com](https://rsx.inferlink.com/#/)
fo5o38n
webscraping
eghantous
t1_fo5o38n
https://www.reddit.com/r/webscraping/comments/g45iym/evaluating_web_scraping_tools/fo5o38n/
4/22/2020 4:43:49 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
g45iym
t3_g45iym
g45iym
0
g45iym
False
False
False
0
1
6
6
1
1.96078431372549
0
0
0
0
22
43.1372549019608
51
128, 128, 128
3
Solid
50
No
667
Posted
12/2/2016 7:20:14 AM
http://www.octoparse.com/blog/a-must-have-web-scraper-for-data-comparison-software-octoparse/
5g20qw
InternetIsBeautiful
M_Johny
t3_5g20qw
https://www.reddit.com/r/InternetIsBeautiful/comments/5g20qw/a_musthave_web_scraper_for_data_comparison/
12/2/2016 7:20:14 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
A Must-Have Web Scraper for Data Comparison Software - Octoparse
False
1
5g20qw
0
1
1
1
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
666
Commented
2/18/2022 11:50:09 AM
Btw, big love 😍🤌 for your product
hxfoql4
Octoparse_ideas
User-new-wth
t1_hxfoql4
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxfoql4/
2/18/2022 11:50:09 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
1
pk0ql2
False
False
False
0
196
3
3
1
16.6666666666667
0
0
0
0
3
50
6
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
665
RepliedTo
2/19/2022 2:43:44 AM
Thanks for your support!
hxj4xcf
Octoparse_ideas
Octoparseideas
t1_hxj4xcf
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxj4xcf/
2/19/2022 2:43:44 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxfoql4
t1_hxfoql4
hxfoql4
0
pk0ql2
True
False
False
1
196
3
3
1
25
0
0
0
0
1
25
4
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
664
Commented
2/18/2022 11:48:34 AM
Please add it to the new release, it’s a really helpful option; this way we can start a local task which requires a captcha and later we can disable image loading.
hxfolkg
Octoparse_ideas
User-new-wth
t1_hxfolkg
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxfolkg/
2/18/2022 11:48:34 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
1
pk0ql2
False
False
False
0
196
3
3
0
0
1
3.125
0
0
16
50
32
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
663
RepliedTo
2/19/2022 2:43:30 AM
We will definitely take it into consideration then~
hxj4waq
Octoparse_ideas
Octoparseideas
t1_hxj4waq
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxj4waq/
2/19/2022 2:43:30 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxfolkg
t1_hxfolkg
hxfolkg
0
pk0ql2
True
False
False
1
196
3
3
0
0
0
0
0
0
3
37.5
8
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
662
Commented
2/17/2022 4:27:27 PM
I know that one; I was talking about disabling it in the local run window.
Ex:
Start local run
Complete captcha
Click the Run Settings button
A pop-up appears and you can select “Do not load images in local run” there
This button is only available in Octoparse 8.4 (local run window)
On 8.5 it now only shows “Show browser” and “Edit task”
hxbpeyr
Octoparse_ideas
User-new-wth
t1_hxbpeyr
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxbpeyr/
2/17/2022 4:27:27 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
1
pk0ql2
False
False
False
0
196
3
3
1
1.58730158730159
1
1.58730158730159
0
0
33
52.3809523809524
63
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
661
RepliedTo
2/18/2022 1:18:55 AM
Hi, it is true that the option is removed in 8.5.
hxdz1qn
Octoparse_ideas
Octoparseideas
t1_hxdz1qn
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxdz1qn/
2/18/2022 1:18:55 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxbpeyr
t1_hxbpeyr
hxbpeyr
0
pk0ql2
True
False
False
1
196
3
3
0
0
0
0
0
0
4
33.3333333333333
12
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
660
Commented
2/17/2022 6:40:27 AM
After starting a local run, sometimes you start the scrape as normal so you can complete the captcha; after the captcha you go to settings (top-right corner) and check “Do not load images in local run” so you can speed up the process and save some data usage.
hxa1hp9
Octoparse_ideas
User-new-wth
t1_hxa1hp9
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxa1hp9/
2/17/2022 6:40:27 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
2
pk0ql2
False
False
False
0
196
3
3
0
0
0
0
0
0
24
51.063829787234
47
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
659
RepliedTo
2/17/2022 8:03:59 AM
Click on the "settings" symbol beside the browse symbol, and you will be able to see the checkbox "Disable image loading" in the "Run Settings" section.
hxa86po
Octoparse_ideas
Octoparseideas
t1_hxa86po
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxa86po/
2/17/2022 8:03:59 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxa1hp9
t1_hxa1hp9
hxa1hp9
0
pk0ql2
True
False
False
1
196
3
3
0
0
1
3.84615384615385
0
0
13
50
26
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
658
RepliedTo
2/17/2022 7:59:11 AM
The checkbox is still available in "task settings".
hxa7tfy
Octoparse_ideas
Octoparseideas
t1_hxa7tfy
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxa7tfy/
2/17/2022 7:59:11 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxa1hp9
t1_hxa1hp9
hxa1hp9
0
pk0ql2
True
False
False
1
196
3
3
1
12.5
0
0
0
0
4
50
8
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
657
Commented
2/17/2022 6:36:40 AM
Image loading: checkbox after starting in local runs.
“Do not load images in local runs”
hxa15za
Octoparse_ideas
User-new-wth
t1_hxa15za
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxa15za/
2/17/2022 6:36:40 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
0
pk0ql2
False
False
False
0
196
3
3
0
0
0
0
0
0
10
66.6666666666667
15
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
656
Commented
2/16/2022 10:44:41 PM
Any idea how you can disable “image loading” after you start the task in a local run, after updating to the latest Octoparse? (Before, you had an option button in the top-right of the window; now it’s gone.)
hx8g31l
Octoparse_ideas
User-new-wth
t1_hx8g31l
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hx8g31l/
2/16/2022 10:44:41 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
1
pk0ql2
False
False
False
0
196
3
3
2
5.71428571428571
1
2.85714285714286
0
0
15
42.8571428571429
35
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
655
RepliedTo
2/17/2022 1:19:11 AM
Hi, what do you mean by "image loading"? Are you referring to the "Image URL"?
hx91dbu
Octoparse_ideas
Octoparseideas
t1_hx91dbu
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hx91dbu/
2/17/2022 1:19:11 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hx8g31l
t1_hx8g31l
hx8g31l
0
pk0ql2
True
False
False
1
196
3
3
0
0
0
0
0
0
7
46.6666666666667
15
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
654
Commented
2/9/2022 10:19:00 PM
Any idea how you can insert different text in loop URLs?
Ex: Go to Loop URL1 and insert text1
Go to Loop URL2 and insert text2.
Really need help with this guys!!😁🙏
hwaaudu
Octoparse_ideas
User-new-wth
t1_hwaaudu
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hwaaudu/
2/9/2022 10:19:00 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
1
pk0ql2
False
False
False
0
196
3
3
0
0
0
0
0
0
21
63.6363636363636
33
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
653
RepliedTo
2/10/2022 1:12:37 AM
Hi, please submit a ticket here [https://helpcenter.octoparse.com/hc/en-us/requests/new](https://helpcenter.octoparse.com/hc/en-us/requests/new) and the customer service team will help you.
hwb02er
Octoparse_ideas
Octoparseideas
t1_hwb02er
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hwb02er/
2/10/2022 1:12:37 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hwaaudu
t1_hwaaudu
hwaaudu
0
pk0ql2
True
False
False
1
196
3
3
0
0
0
0
0
0
16
50
32
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
652
Commented
2/18/2022 11:50:09 AM
Btw, big love 😍🤌 for your product
hxfoql4
Octoparse_ideas
User-new-wth
t1_hxfoql4
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxfoql4/
2/18/2022 11:50:09 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
1
pk0ql2
False
False
False
0
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
651
RepliedTo
2/19/2022 2:43:44 AM
Thanks for your support!
hxj4xcf
Octoparse_ideas
Octoparseideas
t1_hxj4xcf
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxj4xcf/
2/19/2022 2:43:44 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxfoql4
t1_hxfoql4
hxfoql4
0
pk0ql2
True
False
False
1
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
650
Commented
2/18/2022 11:48:34 AM
Please add it to the new release, it’s a really helpful option; this way we can start a local task which requires a captcha and later we can disable image loading.
hxfolkg
Octoparse_ideas
User-new-wth
t1_hxfolkg
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxfolkg/
2/18/2022 11:48:34 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
1
pk0ql2
False
False
False
0
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
649
RepliedTo
2/19/2022 2:43:30 AM
We will definitely take it into consideration then~
hxj4waq
Octoparse_ideas
Octoparseideas
t1_hxj4waq
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxj4waq/
2/19/2022 2:43:30 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxfolkg
t1_hxfolkg
hxfolkg
0
pk0ql2
True
False
False
1
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
648
Commented
2/17/2022 4:27:27 PM
I know that one; I was talking about disabling it in the local run window.
Ex:
Start local run
Complete captcha
Click the Run Settings button
A pop-up appears and you can select “Do not load images in local run” there
This button is only available in Octoparse 8.4 (local run window)
On 8.5 it now only shows “Show browser” and “Edit task”
hxbpeyr
Octoparse_ideas
User-new-wth
t1_hxbpeyr
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxbpeyr/
2/17/2022 4:27:27 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
1
pk0ql2
False
False
False
0
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
647
RepliedTo
2/18/2022 1:18:55 AM
Hi, it is true that the option is removed in 8.5.
hxdz1qn
Octoparse_ideas
Octoparseideas
t1_hxdz1qn
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxdz1qn/
2/18/2022 1:18:55 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxbpeyr
t1_hxbpeyr
hxbpeyr
0
pk0ql2
True
False
False
1
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
646
Commented
2/17/2022 6:40:27 AM
After starting a local run, sometimes you start the scrape as normal so you can complete the captcha; after the captcha you go to settings (top-right corner) and check “Do not load images in local run” so you can speed up the process and save some data usage.
hxa1hp9
Octoparse_ideas
User-new-wth
t1_hxa1hp9
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxa1hp9/
2/17/2022 6:40:27 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
2
pk0ql2
False
False
False
0
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
645
RepliedTo
2/17/2022 8:03:59 AM
Click on the "settings" symbol beside the browse symbol, and you will be able to see the checkbox "Disable image loading" in the "Run Settings" section.
hxa86po
Octoparse_ideas
Octoparseideas
t1_hxa86po
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxa86po/
2/17/2022 8:03:59 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxa1hp9
t1_hxa1hp9
hxa1hp9
0
pk0ql2
True
False
False
1
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
644
RepliedTo
2/17/2022 7:59:11 AM
The checkbox is still available in "task settings".
hxa7tfy
Octoparse_ideas
Octoparseideas
t1_hxa7tfy
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxa7tfy/
2/17/2022 7:59:11 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxa1hp9
t1_hxa1hp9
hxa1hp9
0
pk0ql2
True
False
False
1
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
643
Commented
2/17/2022 6:36:40 AM
Image loading: checkbox after starting in local runs.
“Do not load images in local runs”
hxa15za
Octoparse_ideas
User-new-wth
t1_hxa15za
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hxa15za/
2/17/2022 6:36:40 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
0
pk0ql2
False
False
False
0
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
642
Commented
2/16/2022 10:44:41 PM
Any idea how you can disable “image loading” after you start the task in a local run, after updating to the latest Octoparse? (Before, you had an option button in the top-right of the window; now it’s gone.)
hx8g31l
Octoparse_ideas
User-new-wth
t1_hx8g31l
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hx8g31l/
2/16/2022 10:44:41 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
1
pk0ql2
False
False
False
0
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
641
RepliedTo
2/17/2022 1:19:11 AM
Hi, what do you mean by "image loading"? Are you referring to the "Image URL"?
hx91dbu
Octoparse_ideas
Octoparseideas
t1_hx91dbu
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hx91dbu/
2/17/2022 1:19:11 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hx8g31l
t1_hx8g31l
hx8g31l
0
pk0ql2
True
False
False
1
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
640
Commented
2/9/2022 10:19:00 PM
Any idea how you can insert different text in loop URLs?
Ex: Go to Loop URL1 and insert text1
Go to Loop URL2 and insert text2.
Really need help with this guys!!😁🙏
hwaaudu
Octoparse_ideas
User-new-wth
t1_hwaaudu
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hwaaudu/
2/9/2022 10:19:00 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pk0ql2
t3_pk0ql2
pk0ql2
1
pk0ql2
False
False
False
0
196
3
3
131, 125, 125
3.1844843897824
Dash Dot Dot
49.2093526152183
Yes
639
RepliedTo
2/10/2022 1:12:37 AM
Hi, please submit a ticket here [https://helpcenter.octoparse.com/hc/en-us/requests/new](https://helpcenter.octoparse.com/hc/en-us/requests/new) and the customer service team will help you.
hwb02er
Octoparse_ideas
Octoparseideas
t1_hwb02er
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/hwb02er/
2/10/2022 1:12:37 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hwaaudu
t1_hwaaudu
hwaaudu
0
pk0ql2
True
False
False
1
196
3
3
128, 128, 128
3
Solid
50
Yes
621
Commented
1/9/2021 7:49:15 PM
I’ve never used Octoparse before but I use ParseHub and that’s super easy. They even have a YouTube channel explaining how to use the tool. May be worth switching. Here’s the link to them scraping Yellow Pages: https://youtu.be/dhvAz7ejc-M
giop8lv
sweatystartup
AnyHoneydew4
t1_giop8lv
https://www.reddit.com/r/sweatystartup/comments/ktexr1/how_do_i_use_a_data_scraping_tool_like_octoparse/giop8lv/
1/9/2021 7:49:15 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ktexr1
t3_ktexr1
ktexr1
1
ktexr1
False
False
False
0
1
52
52
3
7.5
0
0
0
0
16
40
40
128, 128, 128
3
Solid
50
Yes
620
RepliedTo
1/14/2021 8:59:38 PM
Got it. This seems like a much better option I’m gonna give it a try. Thank you!
gj9qkhe
sweatystartup
databoy-thatsme
t1_gj9qkhe
https://www.reddit.com/r/sweatystartup/comments/ktexr1/how_do_i_use_a_data_scraping_tool_like_octoparse/gj9qkhe/
1/14/2021 8:59:38 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
giop8lv
t1_giop8lv
giop8lv
0
ktexr1
True
False
False
1
1
52
52
2
11.1111111111111
0
0
0
0
6
33.3333333333333
18
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
623
Posted
1/8/2021 11:42:35 PM
First off, thanks for reading this. I tried to do this myself with my extremely minimal coding experience and wasn’t able to figure it out.
There is a website that has a directory for companies I want to contact. It is not a major company website.
I want to use a bot to click on each link in the directory, then record their email & phone number so I can call them/email them.
There are thousands of entries, so it would take too long to do it myself (I tried.. lol)
Any help on this would be much appreciated.
ktexr1
sweatystartup
databoy-thatsme
t3_ktexr1
https://www.reddit.com/r/sweatystartup/comments/ktexr1/how_do_i_use_a_data_scraping_tool_like_octoparse/
1/8/2021 11:42:35 PM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How do I use a data scraping tool like Octoparse to pull data from a directory?
False
0.57
ktexr1
0
4
52
52
1
1
0
0
0
0
39
39
100
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
622
Posted
1/8/2021 11:46:44 PM
[removed]
ktf0q0
datascience
databoy-thatsme
t3_ktf0q0
https://www.reddit.com/r/datascience/comments/ktf0q0/new_to_data_science_i_am_trying_to_use/
1/8/2021 11:46:44 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
New to Data science. I am trying to use Octoparse/other softwares to scrape data from a directory.
False
1
ktf0q0
0
4
52
52
0
0
0
0
0
0
1
100
1
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
617
Commented
8/18/2021 6:04:55 AM
If the scale of your scraping is big, you should try out this company: datamam.com. I had 4 sites to collect information from and spent a lot of time on it with those kinds of applications. Then I just used their service and had the job done in 3 days, and now I receive data every week. In short, if the scale is big, let them do it.
h9dq3bc
webscraping
Tsk4ro
t1_h9dq3bc
https://www.reddit.com/r/webscraping/comments/p6hwen/anyone_know_how_to_use_octoparse/h9dq3bc/
8/18/2021 6:04:55 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
p6hwen
t3_p6hwen
p6hwen
1
p6hwen
False
False
False
0
4
53
53
0
0
0
0
0
0
25
36.231884057971
69
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
616
RepliedTo
8/18/2021 9:08:37 PM
It’s not huge
h9gcoh6
webscraping
Nivvy_Miz
t1_h9gcoh6
https://www.reddit.com/r/webscraping/comments/p6hwen/anyone_know_how_to_use_octoparse/h9gcoh6/
8/18/2021 9:08:37 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9dq3bc
t1_h9dq3bc
h9dq3bc
0
p6hwen
True
False
False
1
4
53
53
0
0
0
0
0
0
1
25
4
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
615
Commented
8/18/2021 6:04:55 AM
If the scale of your scraping is big, you should try out this company: datamam.com. I had 4 sites to collect information from and spent a lot of time on it with those kinds of applications. Then I just used their service and had the job done in 3 days, and now I receive data every week. In short, if the scale is big, let them do it.
h9dq3bc
webscraping
Tsk4ro
t1_h9dq3bc
https://www.reddit.com/r/webscraping/comments/p6hwen/anyone_know_how_to_use_octoparse/h9dq3bc/
8/18/2021 6:04:55 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
p6hwen
t3_p6hwen
p6hwen
1
p6hwen
False
False
False
0
4
53
53
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
614
RepliedTo
8/18/2021 9:08:37 PM
It’s not huge
h9gcoh6
webscraping
Nivvy_Miz
t1_h9gcoh6
https://www.reddit.com/r/webscraping/comments/p6hwen/anyone_know_how_to_use_octoparse/h9gcoh6/
8/18/2021 9:08:37 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h9dq3bc
t1_h9dq3bc
h9dq3bc
0
p6hwen
True
False
False
1
4
53
53
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
619
Posted
8/18/2021 2:10:07 AM
I need to do some web scraping, so I have been using a program called Octoparse, which I’ve used in the past. I’ve come across an issue, though, and after many hours I still don’t understand what’s wrong. I don’t know how to code, so Python isn’t an option. I was wondering if anyone knows Octoparse and could quickly look at my workflow, because it’s probably a stupid error or something.
p6hwen
webscraping
Nivvy_Miz
t3_p6hwen
https://www.reddit.com/r/webscraping/comments/p6hwen/anyone_know_how_to_use_octoparse/
8/18/2021 2:10:07 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Anyone know how to use octoparse?
False
1
p6hwen
0
4
53
53
0
0
4
5.12820512820513
0
0
32
41.025641025641
78
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
618
Posted
8/18/2021 2:10:07 AM
I need to do some web scraping, so I have been using a program called Octoparse, which I’ve used in the past. I’ve come across an issue, though, and after many hours I still don’t understand what’s wrong. I don’t know how to code, so Python isn’t an option. I was wondering if anyone knows Octoparse and could quickly look at my workflow, because it’s probably a stupid error or something.
p6hwen
webscraping
Nivvy_Miz
t3_p6hwen
https://www.reddit.com/r/webscraping/comments/p6hwen/anyone_know_how_to_use_octoparse/
8/18/2021 2:10:07 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Anyone know how to use octoparse?
False
1
p6hwen
0
4
53
53
128, 128, 128
3
Solid
50
Yes
607
Commented
6/23/2021 11:21:48 PM
It’s been a while since I’ve done scraping, but is Beautiful Soup not a thing anymore?
h2tq1v1
datascience
guattarist
t1_h2tq1v1
https://www.reddit.com/r/datascience/comments/o6nn4k/top_5_scraping_tools_for_beginners_importio_vs/h2tq1v1/
6/23/2021 11:21:48 PM
1/1/0001 12:00:00 AM
False
False
4
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o6nn4k
t3_o6nn4k
o6nn4k
1
o6nn4k
False
False
False
0
1
2
2
1
5.55555555555556
0
0
0
0
6
33.3333333333333
18
128, 128, 128
3
Solid
50
Yes
606
RepliedTo
6/24/2021 12:45:14 AM
Sure, but in this comparison I have selected GUI-friendly tools for people who are not yet programming experts. Beautiful Soup is a console tool :)
h2tzgp7
datascience
carlpaul153
t1_h2tzgp7
https://www.reddit.com/r/datascience/comments/o6nn4k/top_5_scraping_tools_for_beginners_importio_vs/h2tzgp7/
6/24/2021 12:45:14 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h2tq1v1
t1_h2tq1v1
h2tq1v1
0
o6nn4k
True
False
False
1
1
2
2
2
8
0
0
0
0
11
44
25
128, 128, 128
3
Solid
50
No
605
Posted
3/23/2023 2:42:02 AM
https://startupbuffer.com/startup/octoparse
11z64hm
startupbuffer
makyol48
t3_11z64hm
https://www.reddit.com/r/startupbuffer/comments/11z64hm/octoparse_1_web_scraping_services_free_data/
3/23/2023 2:42:02 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse: #1 Web Scraping Services & Free Data Extraction Tool|Octoparse, Free Web Scraping
False
1
11z64hm
0
1
1
1
128, 128, 128
3
Solid
50
No
603
Commented
2/11/2023 2:47:22 PM
If you are not happy paying for the service, then you can try [MrScraper.com](https://MrScraper.com) instead, there's a FREE tier (disclaimer, I'm the founder).
And if you want a cheap pro account, use "MRDISCOUNT" when checking out for 40% off.
j846sf8
webscraping
Vanlombardi
t1_j846sf8
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j846sf8/
2/11/2023 2:47:22 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
10z2cbs
t3_10z2cbs
10z2cbs
0
10z2cbs
False
False
False
0
1
4
4
2
4.65116279069767
1
2.32558139534884
0
0
15
34.8837209302326
43
128, 128, 128
3
Solid
50
No
602
Commented
2/11/2023 3:42:34 AM
People think they canceled stuff when in reality they didn’t. Just email them asking for a refund of a likely mistake rather than falsely claiming a company is a scam.
j82jjr7
webscraping
Crypto_Eagle
t1_j82jjr7
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j82jjr7/
2/11/2023 3:42:34 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
10z2cbs
t3_10z2cbs
10z2cbs
0
10z2cbs
False
False
False
0
1
4
4
1
3.33333333333333
3
10
0
0
10
33.3333333333333
30
128, 128, 128
3
Solid
50
No
601
Commented
2/10/2023 10:17:18 PM
I had the same experience with Paramount+. Canceled before the trial ended and they still charged me their monthly fee. I try to avoid any trial so that I don’t have to deal with them. Canceled Testing Mom after a few years of use; canceled it successfully, but they still charged me for another year.
Seems like a trend for their revenue. Finally I had to cancel my credit card.
j81d98y
webscraping
gnobile
t1_j81d98y
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j81d98y/
2/10/2023 10:17:18 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
10z2cbs
t3_10z2cbs
10z2cbs
1
10z2cbs
False
False
False
0
1
4
4
2
3.07692307692308
0
0
0
0
30
46.1538461538462
65
128, 128, 128
3
Solid
50
No
600
RepliedTo
2/11/2023 10:03:14 AM
We at QuickScraper provide a free plan with no credit card required. We also support data extraction directly in JSON, CSV & Excel. If you are interested, feel free to contact us.
https://quickscraper.co/pricing/
j83fzl3
webscraping
bhushankumar_fst
t1_j83fzl3
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j83fzl3/
2/11/2023 10:03:14 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j81d98y
t1_j81d98y
j81d98y
0
10z2cbs
False
False
False
1
1
4
4
4
12.9032258064516
0
0
0
0
13
41.9354838709677
31
128, 128, 128
3
Solid
50
No
595
RepliedTo
2/10/2023 9:01:11 PM
Try [Privacy](https://privacy.com/) next time. It’s free and has saved me a lot of unexpected charges
j811w05
webscraping
--silas--
t1_j811w05
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j811w05/
2/10/2023 9:01:11 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j811483
t1_j811483
j811483
0
10z2cbs
False
False
False
4
1
4
4
1
5.26315789473684
1
5.26315789473684
0
0
8
42.1052631578947
19
128, 128, 128
3
Solid
50
No
604
Posted
2/10/2023 8:37:06 PM
I tried their free-trial. Not good. So I discontinued it before it renewed.
Guess what? They still charged my card $89.00.
Now I need to call my banks.
10z2cbs
webscraping
FantomHouse
t3_10z2cbs
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/
2/10/2023 8:37:06 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse is a scam. Avoid them.
False
0.5
10z2cbs
0
1
4
4
3
10
1
3.33333333333333
0
0
12
40
30
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
599
Commented
2/10/2023 8:42:13 PM
Not enough information to judge. Btw, I’m not their user and I never will be.
j80yz7r
webscraping
trafalgarDxlaw
t1_j80yz7r
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j80yz7r/
2/10/2023 8:42:13 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
10z2cbs
t3_10z2cbs
10z2cbs
1
10z2cbs
False
False
False
0
2
4
4
1
6.66666666666667
0
0
0
0
5
33.3333333333333
15
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
598
RepliedTo
2/10/2023 8:45:25 PM
Well… there isn’t really much I can provide except for the bank transaction record. I even disabled (or deleted?) the account when I discontinued, so I don’t even have access to my account either.
j80zgkf
webscraping
FantomHouse
t1_j80zgkf
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j80zgkf/
2/10/2023 8:45:25 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j80yz7r
t1_j80yz7r
j80yz7r
1
10z2cbs
True
False
False
1
4
4
4
1
2.94117647058824
2
5.88235294117647
0
0
11
32.3529411764706
34
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
597
RepliedTo
2/10/2023 8:50:21 PM
That’s why I avoid using my bank account directly; I always use PayPal to avoid this kind of scam.
j81080e
webscraping
trafalgarDxlaw
t1_j81080e
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j81080e/
2/10/2023 8:50:21 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j80zgkf
t1_j80zgkf
j80zgkf
1
10z2cbs
False
False
False
2
2
4
4
0
0
1
5.26315789473684
0
0
10
52.6315789473684
19
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
596
RepliedTo
2/10/2023 8:56:09 PM
Yeah… I used my credit card at least; I can call the bank.
j811483
webscraping
FantomHouse
t1_j811483
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j811483/
2/10/2023 8:56:09 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j81080e
t1_j81080e
j81080e
1
10z2cbs
True
False
False
3
4
4
4
0
0
0
0
0
0
6
54.5454545454545
11
128, 128, 128
3.01513718070009
Dash Dot Dot
49.9351263684282
Yes
594
Commented
2/10/2023 8:41:18 PM
You didn't provide a lot of information to go by. What exactly makes you think Octoparse is a scam?
j80yu7p
webscraping
GullibleEngineer4
t1_j80yu7p
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j80yu7p/
2/10/2023 8:41:18 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
10z2cbs
t3_10z2cbs
10z2cbs
1
10z2cbs
False
False
False
0
17
4
4
0
0
1
5.26315789473684
0
0
8
42.1052631578947
19
128, 128, 128
3.01419110690634
Dash Dot Dot
49.9391809704014
Yes
593
RepliedTo
2/10/2023 8:48:53 PM
>What made me think Octoparse is a scam? Uh, read the post plz. There isn't really much I can provide except for the bank transaction record. I even disabled the account (or deleted?) when I discontinued, so I don't even have access to my acc either.
j80zzxs
webscraping
FantomHouse
t1_j80zzxs
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j80zzxs/
2/10/2023 8:48:53 PM
1/1/0001 12:00:00 AM
False
False
-1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j80yu7p
t1_j80yu7p
j80yu7p
1
10z2cbs
True
False
False
1
16
4
4
0
0
3
6.52173913043478
0
0
17
36.9565217391304
46
128, 128, 128
3.01513718070009
Dash Dot Dot
49.9351263684282
Yes
592
RepliedTo
2/10/2023 8:51:52 PM
You said you didn't like them (not good) but that doesn't equate to a scam.
Of course they will charge you because it is a paid service.
Btw, I don't have any affiliation with them nor do I use their service.
j810gex
webscraping
GullibleEngineer4
t1_j810gex
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j810gex/
2/10/2023 8:51:52 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j80zzxs
t1_j80zzxs
j80zzxs
1
10z2cbs
False
False
False
2
17
4
4
1
2.4390243902439
1
2.4390243902439
0
0
9
21.9512195121951
41
128, 128, 128
3.01419110690634
Dash Dot Dot
49.9391809704014
Yes
591
RepliedTo
2/10/2023 8:55:34 PM
Wait... did you even read my post?
I said I discontinued the service during the free trial lol, and they have the right to charge me anyway after? Like I said, read the post plz!
j81113f
webscraping
FantomHouse
t1_j81113f
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j81113f/
2/10/2023 8:55:34 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j810gex
t1_j810gex
j810gex
1
10z2cbs
True
False
False
3
16
4
4
2
6.06060606060606
1
3.03030303030303
0
0
10
30.3030303030303
33
128, 128, 128
3.01513718070009
Dash Dot Dot
49.9351263684282
Yes
590
RepliedTo
2/10/2023 8:58:52 PM
A lot of services will charge you even in the free trial if you use premium features or go beyond a certain limit during your trial.
Sometimes there are mistakes made as well. I was charged $100 on a trial with digital ocean, but they returned it when I contacted the support and explained it. You should be able to get it refunded if it's a similar mistake on their end by contacting support.
j811j7p
webscraping
GullibleEngineer4
t1_j811j7p
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j811j7p/
2/10/2023 8:58:52 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j81113f
t1_j81113f
j81113f
1
10z2cbs
False
False
False
4
17
4
4
5
6.75675675675676
3
4.05405405405405
0
0
25
33.7837837837838
74
128, 128, 128
3.01419110690634
Dash Dot Dot
49.9391809704014
Yes
589
RepliedTo
2/10/2023 9:01:05 PM
I see your point, but this is a totally different case.
I used their free trial, checked what their service can do, and discontinued a few minutes after.
And a few weeks later they still charged my card without any authorization, even after I cancelled/closed my account.
I just don't want others to go through this BS.
j811vfr
webscraping
FantomHouse
t1_j811vfr
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j811vfr/
2/10/2023 9:01:05 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j811j7p
t1_j811j7p
j811j7p
1
10z2cbs
True
False
False
5
16
4
4
1
1.78571428571429
2
3.57142857142857
0
0
24
42.8571428571429
56
128, 128, 128
3.01513718070009
Dash Dot Dot
49.9351263684282
Yes
588
RepliedTo
2/10/2023 9:02:53 PM
Looks like a mistake on their end, you should be able to get it refunded by contacting support.
I have had multiple such refunds in the past with a few services.
j8125jg
webscraping
GullibleEngineer4
t1_j8125jg
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j8125jg/
2/10/2023 9:02:53 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j811vfr
t1_j811vfr
j811vfr
1
10z2cbs
False
False
False
6
17
4
4
2
6.66666666666667
1
3.33333333333333
0
0
9
30
30
128, 128, 128
3.01419110690634
Dash Dot Dot
49.9391809704014
Yes
587
RepliedTo
2/10/2023 9:05:09 PM
Yeah. I apologize for being somewhat rude; I assumed you were one of their employees. I am sorry.
I don't want anyone else to fall for this and waste time.
j812i14
webscraping
FantomHouse
t1_j812i14
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j812i14/
2/10/2023 9:05:09 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j8125jg
t1_j8125jg
j8125jg
1
10z2cbs
True
False
False
7
16
4
4
0
0
4
11.7647058823529
0
0
11
32.3529411764706
34
128, 128, 128
3.01513718070009
Dash Dot Dot
49.9351263684282
Yes
586
RepliedTo
2/10/2023 9:22:26 PM
No problem man. I just wanted to help you resolve your problem by either helping you understand the charges made to your card or getting your money back if the mistake was on their end.
j8155f9
webscraping
GullibleEngineer4
t1_j8155f9
https://www.reddit.com/r/webscraping/comments/10z2cbs/octoparse_is_a_scam_avoid_them/j8155f9/
2/10/2023 9:22:26 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j812i14
t1_j812i14
j812i14
0
10z2cbs
False
False
False
8
17
4
4
1
2.85714285714286
3
8.57142857142857
0
0
13
37.1428571428571
35
128, 128, 128
3
Solid
50
No
585
Posted
11/22/2022 12:04:56 PM
Nearly every day, billions of people from all walks of life and across the globe use social media for purchasing, trading, unwinding, reading the headlines, or simply staying connected. As a business, it is almost impossible to get anything done without reaching a customer base online. But in a world of billions, how do you locate the few thousand people who make up your audience? How do you sift through hundreds of billions of bytes of information to uncover the few thousand gigabytes relevant to your company? Cutting through social media clutter and extracting relevant data through [web scraping services](https://thesentimentai.com/) is a quick, easy, and cost-effective technique. Please feel free to browse the contents list.
[web scraping services](https://preview.redd.it/dzoe51ajsh1a1.png?width=509&format=png&auto=webp&v=enabled&s=e5fa733eccb9add048f6e25b99ccad3082b17eb6)
## Social Media Scraping: The Meaning
Social media scraping is a technique for automatically collecting data from social media sites such as Twitter, Facebook, and Instagram.
Because it is an automated procedure carried out by bots, it saves users time, energy, and occasionally money. Done by hand, the same work could take a considerable amount of time: imagine searching the web for every use of a specific word, or for every price of a specific item.
Although it may seem like a breach of privacy, scraping public data is entirely lawful. If a profile is publicly visible, its information can be collected.
This information comprises the publicly available demographics used in marketing, including but not limited to age, race, gender, location, interests, and ethnicity. Marketers use these details to identify and target specific users.
## Social Media Scraping: The Benefits
No overview of social media scraping would be complete without its benefits. Consider the following:
* Conduct sentiment analysis to inform marketing strategy.
* Upgrade public relations strategies and tactics.
* Enhance product development and business planning procedures.
* Boost audience participation.
## Social Media Scraping: The Tools
You can rapidly and effectively scrape information from social media using one of the numerous data scraping APIs available. But it's the best one you're after, right? Then stop your search. Numerous tools are available that can automate and streamline the [social media scraping](https://thesentimentai.com/#services) procedure, making it much simpler to obtain the required data.
[Social media scraping](https://preview.redd.it/tq1p57q4th1a1.png?width=510&format=png&auto=webp&v=enabled&s=2afdc248cd4534358991be1c57ba32d624248374)
1. Sentiment AI
2. Dripify
3. Snov.io
4. Octoparse
5. Import.io
6. Parsehub
7. Leadjet
8. Coolsales
9. Captain Data
10. Pharow
11. ScrapingBee
12. Webscraper.io
13. Zyte
14. Apify
15. ScraperAPI
16. Jarvee
17. Scrab.in
18. Proxycrawl Scraper API
## Social Media Scraping: The Final Rundown
It is well acknowledged that social media is transforming how people communicate with one another. It has even changed how we see the world.
It can be difficult for organizations and individuals to stay on top of changes and satisfy the demands of their followers in the ever-changing [social media landscape](https://thesentimentai.com/#services). Scraping social media sites is a fantastic approach to learn more about your customers and discover what they want from you.
z1rvlc
u_Thesentimentai
Thesentimentai
t3_z1rvlc
https://www.reddit.com/r/u_Thesentimentai/comments/z1rvlc/what_is_social_media_scraping/
11/22/2022 12:04:56 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What Is Social Media Scraping?
False
1
z1rvlc
0
1
1
1
26
4.4750430292599
4
0.688468158347676
0
0
311
53.5283993115318
581
128, 128, 128
3
Solid
50
No
584
Commented
3/10/2022 4:26:23 PM
They are certainly not the leading solutions in the market; there are plenty like them. Try [Proxycrawl](https://proxycrawl.com/); many of the Fortune 500 companies use and trust it. It's a one-stop shop for all your data extraction needs. Proxycrawl has a reliable API for programmatically requesting websites to extract data from various websites without worrying about restrictions, CAPTCHAs, or blocks.
i04fpoj
webscraping
Plenty-Explorer-9854
t1_i04fpoj
https://www.reddit.com/r/webscraping/comments/g45iym/evaluating_web_scraping_tools/i04fpoj/
3/10/2022 4:26:23 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
g45iym
t3_g45iym
g45iym
0
g45iym
False
False
False
0
1
6
6
4
6.34920634920635
1
1.58730158730159
0
0
30
47.6190476190476
63
128, 128, 128
3
Solid
50
No
583
Commented
9/8/2020 6:25:50 PM
If you're interested, I built an API (MLScrape) that works like Diffbot; it uses ML and other techniques to scrape data from any product page. The free plan includes 50 monthly requests, and paid plans start at $15/month for 15k requests, so it's a lot more affordable. There's more information available on the website here: https://www.mlscrape.com/
g4gr7ab
webscraping
buneme
t1_g4gr7ab
https://www.reddit.com/r/webscraping/comments/g45iym/evaluating_web_scraping_tools/g4gr7ab/
9/8/2020 6:25:50 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
g45iym
t3_g45iym
g45iym
0
g45iym
False
False
False
0
1
6
6
4
6.89655172413793
0
0
0
0
27
46.551724137931
58
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
582
Commented
4/19/2020 5:24:55 PM
Those are pretty different from each other. Depending on the purpose I would suggest one or the other. Can you tell us more about your use case? Programming skills? Budget? Scale of the problem?
fnwey2f
webscraping
k_smith182
t1_fnwey2f
https://www.reddit.com/r/webscraping/comments/g45iym/evaluating_web_scraping_tools/fnwey2f/
4/19/2020 5:24:55 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
g45iym
t3_g45iym
g45iym
1
g45iym
False
False
False
0
2
6
6
1
2.94117647058824
1
2.94117647058824
0
0
16
47.0588235294118
34
128, 128, 128
3
Solid
50
Yes
581
RepliedTo
4/19/2020 6:15:49 PM
Hi, thanks for your answer.
Use case: scrape e-commerce sites (manufacturers and suppliers), from thousands of products up to \~1 million, depending on the manufacturer/supplier.
Programming skills: intermediate (i.e. familiar with Python).
Budget: 300 - 500 USD/month
Scale: \~50 sites, from 1 thousand to 1 million items to scrape.
fnwkjyv
webscraping
AndroidePsicokiller
t1_fnwkjyv
https://www.reddit.com/r/webscraping/comments/g45iym/evaluating_web_scraping_tools/fnwkjyv/
4/19/2020 6:15:49 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fnwey2f
t1_fnwey2f
fnwey2f
2
g45iym
True
False
False
1
1
6
6
0
0
2
4
0
0
31
62
50
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
580
RepliedTo
4/19/2020 10:14:52 PM
If you are willing to program stuff in Python, go for Scrapy. If you don't have enough resources or time of development I would outsource the solution ([import.io](https://import.io), Octoparse, DiffBot, or... us).
(Ad) At [shalion.com](https://shalion.com), we are experts in ecommerce scraping. Send me a PM if you would like me to provide you with more details as you could outsource the solution with us: we take care of the whole pipeline and display the info in a web app. Although our tool is for ecommerce marketing and product teams, we can skip that and allow you to gather the data using an API or similar.
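For a paginated e-commerce crawl like the one described in this thread, the core of a hand-rolled scraper is just "fetch page N, pull out product fields, repeat". A minimal stdlib sketch of the extraction step (the `product-name` class and sample markup are hypothetical; real sites differ, and a real crawler would loop over listing-page URLs):

```python
from html.parser import HTMLParser

class ProductParser(HTMLParser):
    """Collects the text of elements whose class is 'product-name'.
    The class name is a made-up example, not any real site's markup."""

    def __init__(self):
        super().__init__()
        self._in_name = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if ("class", "product-name") in attrs:
            self._in_name = True

    def handle_endtag(self, tag):
        self._in_name = False

    def handle_data(self, data):
        if self._in_name and data.strip():
            self.names.append(data.strip())

def parse_products(html: str) -> list[str]:
    parser = ProductParser()
    parser.feed(html)
    return parser.names

# A real crawler would fetch page 1, 2, 3, ... of a listing URL and feed
# each response body to parse_products; here we use a canned page.
sample = '<div class="product-name">Ski Jacket</div><div class="product-name">Climbing Rope</div>'
print(parse_products(sample))  # ['Ski Jacket', 'Climbing Rope']
```

Scrapy wraps this same loop in spiders with built-in request scheduling, retries, and item pipelines, which is why it scales better to \~1 million items than a bare script.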
fnxa3yk
webscraping
k_smith182
t1_fnxa3yk
https://www.reddit.com/r/webscraping/comments/g45iym/evaluating_web_scraping_tools/fnxa3yk/
4/19/2020 10:14:52 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fnwkjyv
t1_fnwkjyv
fnwkjyv
0
g45iym
False
False
False
2
2
6
6
2
1.76991150442478
0
0
0
0
50
44.2477876106195
113
128, 128, 128
3
Solid
50
Yes
579
RepliedTo
4/19/2020 8:22:21 PM
What's your plan to monetize this data set?
I'm doing something similar.
I used scrapy so far, it's been good to me.
fnwy4o6
webscraping
pip_install_Escher
t1_fnwy4o6
https://www.reddit.com/r/webscraping/comments/g45iym/evaluating_web_scraping_tools/fnwy4o6/
4/19/2020 8:22:21 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fnwkjyv
t1_fnwkjyv
fnwkjyv
1
g45iym
False
False
False
2
1
6
6
1
4.54545454545455
0
0
0
0
10
45.4545454545455
22
128, 128, 128
3
Solid
50
Yes
578
RepliedTo
4/19/2020 9:28:11 PM
It has to do with the business of the company I work for. Sorry, but I can't give more details.
Yes, I think I am going to use Scrapy. The learning curve of Octoparse is really fast, but I couldn't complete a simple task there that I can do really easily with Scrapy.
I will try other solutions, but Scrapy seems to be the best option so far.
fnx56j1
webscraping
AndroidePsicokiller
t1_fnx56j1
https://www.reddit.com/r/webscraping/comments/g45iym/evaluating_web_scraping_tools/fnx56j1/
4/19/2020 9:28:11 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fnwy4o6
t1_fnwy4o6
fnwy4o6
1
g45iym
True
False
False
3
1
6
6
4
6.06060606060606
1
1.51515151515152
0
0
26
39.3939393939394
66
128, 128, 128
3
Solid
50
No
577
RepliedTo
4/20/2020 5:30:49 AM
I think you should also try [https://prowebscraper.com/](https://prowebscraper.com/)
because they have:
1. An easy-to-use GUI (fast learning curve)
2. [Pricing](https://prowebscraper.com/pricing) that can accommodate your budget (you can create unlimited scrapers)
3. Most importantly, the ability to scrape data successfully
fnyedq1
webscraping
hiren_p
t1_fnyedq1
https://www.reddit.com/r/webscraping/comments/g45iym/evaluating_web_scraping_tools/fnyedq1/
4/20/2020 5:30:49 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fnx56j1
t1_fnx56j1
fnx56j1
0
g45iym
False
False
False
4
1
6
6
4
8.16326530612245
0
0
0
0
18
36.734693877551
49
128, 128, 128
3
Solid
50
No
576
Commented
4/19/2020 2:11:16 PM
I think [import.io](https://import.io) should be in this list as well. Very nice UI. They seem to be marketing towards larger clients, but there are still freemium offers available. I am biased because I have used the tool for years, but it is the gold standard IMO.
fnvv3tv
webscraping
dslakers
t1_fnvv3tv
https://www.reddit.com/r/webscraping/comments/g45iym/evaluating_web_scraping_tools/fnvv3tv/
4/19/2020 2:11:16 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
g45iym
t3_g45iym
g45iym
0
g45iym
False
False
False
0
1
6
6
4
8.16326530612245
1
2.04081632653061
0
0
21
42.8571428571429
49
128, 128, 128
3
Solid
50
No
575
Posted
5/2/2020 8:47:39 AM
I'm trying to import some data from Airbnb but using IMPORTXML, but I'm getting no data.
From this page, for example, [https://www.airbnb.com/c/ianw6066](https://www.airbnb.com/c/ianw6066?currency=USD) I want to get the following data [https://pasteboard.co/J6tR8f8.png](https://pasteboard.co/J6tR8f8.png) (so 5 texts in total).
For the first one (*Join Airbnb and get up to $44 off your first trip*) I got this xpath `//*[@id='site-content']/div/div/section/section/div[2]/div/div[1]/div` so my formula is `=IMPORTXML(C2,"//*[@id='site-content']/div/div/section/section/div[2]/div/div[1]/div")` but I'm getting the mentioned error.
Then I tried the Octoparse software to check the xpath in there and got this one `//DIV[@class='_m2z73r']`, so I used `=IMPORTXML(C2,"//*DIV[@class='_m2z73r']")` (also without the asterisk) but that didn't work either.
The same goes for the other texts.
Is it possible to import it using IMPORTXML or with any other function?
gc38yd
googlesheets
esty_
t3_gc38yd
https://www.reddit.com/r/googlesheets/comments/gc38yd/importxml_error_na_imported_content_is_empty/
5/2/2020 8:47:39 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
IMPORTXML Error: N/A Imported Content is Empty
False
1
gc38yd
0
1
9
9
1
0.588235294117647
1
0.588235294117647
0
0
87
51.1764705882353
170
128, 128, 128
3
Solid
50
No
574
Commented
5/2/2020 8:47:40 AM
The most common problem when using IMPORTXML occurs when people try to import from websites that use scripts to load data. Check out the [quick guide](https://www.reddit.com/r/googlesheets/wiki/import-html-xml) on how you might be able to solve this.
*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/googlesheets) if you have any questions or concerns.*
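One quick way to diagnose this is to check whether the target text appears in the raw HTML at all — that raw HTML is all IMPORTXML ever sees, since it cannot execute JavaScript. A minimal sketch (the shell markup and Airbnb banner text are illustrative; in practice you would first fetch the page with `urllib.request.urlopen(url).read().decode()`):

```python
def text_in_static_html(html: str, target: str) -> bool:
    """True if `target` appears in the raw (unrendered) HTML.
    If a browser shows the text but this returns False, the page
    most likely renders it with JavaScript, which IMPORTXML
    cannot execute — so the xpath matches nothing."""
    return target in html

# A script-driven page often ships an empty shell plus a JS bundle:
shell = '<div id="site-content"></div><script src="/app.js"></script>'
print(text_in_static_html(shell, "Join Airbnb and get up to $44 off"))  # False
print(text_in_static_html("<p>static text</p>", "static text"))  # True
```

When the check returns False, no IMPORTXML formula will work on that page; you need a tool that renders JavaScript (a headless browser) or the site's underlying JSON endpoint.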
fp905p0
googlesheets
AutoModerator
t1_fp905p0
https://www.reddit.com/r/googlesheets/comments/gc38yd/importxml_error_na_imported_content_is_empty/fp905p0/
5/2/2020 8:47:40 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gc38yd
t3_gc38yd
gc38yd
0
gc38yd
False
False
False
0
1
9
9
0
0
2
2.7027027027027
0
0
34
45.9459459459459
74
128, 128, 128
3
Solid
50
No
573
Posted
8/18/2020 12:59:27 AM
Hey y'all,
I am a web coding newbie who knows HTML, CSS, JavaScript, and React. I am looking to make a side-project React app that is a job board clone. Basically it gets all web development jobs (results from queries with HTML/CSS/JS) and just displays them.
So far in my research I have encountered numerous posts about people making their own scraping tool with Python. I don't know it at the moment and would like to learn it later (after I have mastered React). I have tried the official APIs for Indeed and Glassdoor and they are either gone or you have to be approved, aka be a "real" company. I am currently experimenting with a service called Octoparse which sort of works but doesn't return all the information to me at the moment, and I find the documentation a bit confusing.
Does anyone know of a web scraping service similar to Octoparse that is proven to scrape jobs (title, description, date posted) from Indeed/Glassdoor? Preferably one that has a good free/cheap tier and good documentation? Or is there an API that can get a list of jobs? I tried Jooble but wasn't able to get it to work.
ibr1qj
webdev
themightykrusher
t3_ibr1qj
https://www.reddit.com/r/webdev/comments/ibr1qj/web_scraping_service_or_api_to_get_jobs_from/
8/18/2020 12:59:27 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Web Scraping service or api to get jobs from Indeed/Glassdoor etc.
False
1
ibr1qj
0
1
54
54
7
3.46534653465347
2
0.99009900990099
0
0
90
44.5544554455446
202
128, 128, 128
3
Solid
50
No
572
Commented
9/2/2020 7:29:44 PM
I think finddatalab [web scraping service](https://finddatalab.com/) is for you.
They can cheaply and with high quality reformat any data from a site into a format convenient for you (Excel, PDF, image, JSON, FTP, API, etc.). I have used this company's services repeatedly, and it has always done its job perfectly.
g3rai2q
webdev
Leo_Paredes2354
t1_g3rai2q
https://www.reddit.com/r/webdev/comments/ibr1qj/web_scraping_service_or_api_to_get_jobs_from/g3rai2q/
9/2/2020 7:29:44 PM
9/2/2020 8:06:51 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ibr1qj
t3_ibr1qj
ibr1qj
0
ibr1qj
False
False
False
0
1
54
54
3
5.17241379310345
1
1.72413793103448
0
0
27
46.551724137931
58
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
No
571
Posted
11/14/2022 8:10:55 AM
🤩 Octoparse Black Friday
Up to 40% off, on Nov. 16 only!
[First day](https://preview.redd.it/pg3f090rivz91.jpg?width=1920&format=pjpg&auto=webp&v=enabled&s=ffd7d899d266bffdeabc3d5c7dbe2063a2ddb54f)
\[**Annual Standard**\] Save $271 + 3GB of residential proxy credits + 240 CAPTCHA-solving credits + $24 in template credits + a custom crawler or training
\[**Annual Professional**\] Save $800 + 15GB of residential proxy credits + 600 CAPTCHA-solving credits + $60 in template credits + a custom crawler or 120 minutes of training
👉More details: [Official website](https://www.octoparse.com/black-friday-2022?utm_source=ytb&utm_medium=social&utm_campaign=22bf)
yusn5s
u_Octoparse-hola
Octoparse-hola
t3_yusn5s
https://www.reddit.com/r/u_Octoparse-hola/comments/yusn5s/octoparse_black_friday_ya_viene/
11/14/2022 8:10:55 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
🤩 Octoparse Black Friday is coming!
False
1
yusn5s
0
9
1
1
0
0
0
0
0
0
72
70.5882352941177
102
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
No
570
Posted
11/18/2022 8:26:59 AM
[Black Friday Octoparse](https://preview.redd.it/qxib1qnn6o0a1.jpg?width=1920&format=pjpg&auto=webp&v=enabled&s=b9437710818cc9be29fd8dae32c9e3ea30e34be1)
👏 Black Friday 2022 Octoparse
Up to 30% off when you upgrade to a premium plan
\[**Annual Standard**\] Save $201 + 3GB of residential proxy credits + 240 CAPTCHA-solving credits + $24 in template credits + a custom crawler or training
\[**Annual Professional**\] Save $500 + 15GB of residential proxy credits + 600 CAPTCHA-solving credits + $60 in template credits + a custom crawler or 120 minutes of training
👉 More details: https://www.octoparse.com/black-friday-2022?utm\_source=ytb&utm\_medium=social&utm\_campaign=22bf
yyezqt
u_Octoparse-hola
Octoparse-hola
t3_yyezqt
https://www.reddit.com/r/u_Octoparse-hola/comments/yyezqt/black_friday_2022_octoparse/
11/18/2022 8:26:59 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
👏 Black Friday 2022 Octoparse
False
1
yyezqt
0
9
1
1
0
0
0
0
0
0
63
70
90
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
No
569
Posted
11/16/2022 7:58:30 AM
[Black friday Octoparse](https://preview.redd.it/8omqbyxor90a1.jpg?width=1920&format=pjpg&auto=webp&v=enabled&s=d0dc67a3aff84eabec36458a6c0a6a806396810d)
**💥** Black Friday 2022 Starts Today
First day only: up to 40% off plus free credits!
\[**Annual Standard**\] Save $271 + 3GB of residential proxy credits + 240 CAPTCHA-solving credits + $24 in template credits + a custom crawler or training
\[**Annual Professional**\] Save $800 + 15GB of residential proxy credits + 600 CAPTCHA-solving credits + $60 in template credits + a custom crawler or 120 minutes of training
👉 More details: https://www.octoparse.com/black-friday-2022?utm\_source=ytb&utm\_medium=social&utm\_campaign=22bf
ywnixk
u_Octoparse-hola
Octoparse-hola
t3_ywnixk
https://www.reddit.com/r/u_Octoparse-hola/comments/ywnixk/black_friday_2022_empieza_hoy/
11/16/2022 7:58:30 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
💥 Black Friday 2022 Starts Today
False
1
ywnixk
0
9
1
1
0
0
0
0
0
0
65
69.1489361702128
94
128, 128, 128
3
Solid
50
No
568
Posted
9/6/2017 5:33:52 PM
I'm kinda tech-dumb so please forgive me if I'm not asking the right questions. I have a project in my head that I'm trying to figure out how to do and I don't know if there is something that already exist commercially or if I need to have it built.
What I want to do is create a copy of a website. Specifically a car dealer's website that shows its inventory on any given day. I've played around with Octoparse and HTTrack but they aren't quite there for me.
I don't want to just scrape data. I want a reproduction of the website as it appears, with pictures and all. I'd like it date stamped so I can look back and print any given day's website. And I'd like to have a scheduler so it runs at set intervals, e.g. the first and 15th of the month. I don't need it to copy the internal links ("click here for more pictures") but I do need it to automatically go to page 2, 3, 4, etc.
So what do you think? Does something like this exist or do I need to hire a smart guy to write it for me? And if it is a create your own monster sort of deal, how involved is it? Thanks for any insight.
6yh4vk
Python
cokoop
t3_6yh4vk
https://www.reddit.com/r/Python/comments/6yh4vk/can_i_build_my_own_wayback_machine/
9/6/2017 5:33:52 PM
1/1/0001 12:00:00 AM
False
False
6
2
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Can I build my own Wayback Machine?
False
0.76
6yh4vk
0
1
27
27
2
0.904977375565611
3
1.35746606334842
0
0
77
34.841628959276
221
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
567
Commented
9/7/2017 12:19:09 AM
https://fosswire.com/post/2008/04/create-a-mirror-of-a-website-with-wget/ + cron gets you most of the way there.
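The same wget-plus-cron idea can be sketched in Python: save each run's copy under a date-stamped path, then let cron invoke the script on a schedule. A minimal sketch (the `archive` directory, `dealer` name, and script path are hypothetical; note this grabs a single page, whereas `wget --mirror` also pulls images and CSS):

```python
import datetime
import pathlib
import urllib.request

def snapshot_path(base_dir: str, name: str, when: datetime.date) -> pathlib.Path:
    """Date-stamped location for one day's copy, e.g. archive/dealer/2017-09-07.html."""
    return pathlib.Path(base_dir) / name / f"{when.isoformat()}.html"

def snapshot(url: str, base_dir: str, name: str) -> pathlib.Path:
    """Download `url` and save it under today's date."""
    dest = snapshot_path(base_dir, name, datetime.date.today())
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as resp:
        dest.write_bytes(resp.read())
    return dest

# Scheduled from cron, e.g. at 06:00 on the 1st and 15th of each month:
#   0 6 1,15 * * /usr/bin/python3 /home/me/snapshot.py
```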
dmnw49t
Python
alanjcastonguay
t1_dmnw49t
https://www.reddit.com/r/Python/comments/6yh4vk/can_i_build_my_own_wayback_machine/dmnw49t/
9/7/2017 12:19:09 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
6yh4vk
t3_6yh4vk
6yh4vk
1
6yh4vk
False
False
False
0
2
27
27
0
0
0
0
0
0
3
37.5
8
128, 128, 128
3
Solid
50
Yes
566
RepliedTo
9/7/2017 1:25:05 AM
Thank you. May I ask what "cron" is?
dmnz2nk
Python
cokoop
t1_dmnz2nk
https://www.reddit.com/r/Python/comments/6yh4vk/can_i_build_my_own_wayback_machine/dmnz2nk/
9/7/2017 1:25:05 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
dmnw49t
t1_dmnw49t
dmnw49t
1
6yh4vk
True
False
False
1
1
27
27
1
16.6666666666667
0
0
0
0
2
33.3333333333333
6
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
565
RepliedTo
9/7/2017 2:02:52 AM
A scheduler daemon. From chronos; the Greek word for time. See https://en.wikipedia.org/wiki/Cron and your platform's documentation.
dmo0qr9
Python
alanjcastonguay
t1_dmo0qr9
https://www.reddit.com/r/Python/comments/6yh4vk/can_i_build_my_own_wayback_machine/dmo0qr9/
9/7/2017 2:02:52 AM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
dmnz2nk
t1_dmnz2nk
dmnz2nk
1
6yh4vk
False
False
False
2
2
27
27
0
0
0
0
0
0
9
60
15
128, 128, 128
3
Solid
50
No
564
RepliedTo
9/7/2017 2:02:57 AM
**Cron**
The software utility Cron is a time-based job scheduler in Unix-like computer operating systems. People who set up and maintain software environments use cron to schedule jobs (commands or shell scripts) to run periodically at fixed times, dates, or intervals. It typically automates system maintenance or administration—though its general-purpose nature makes it useful for things like downloading files from the Internet and downloading email at regular intervals. The origin of the name cron is from the Greek word for time, χρόνος (chronos).
***
^[ [^PM](https://www.reddit.com/message/compose?to=kittens_from_space) ^| [^Exclude ^me](https://reddit.com/message/compose?to=WikiTextBot&message=Excludeme&subject=Excludeme) ^| [^Exclude ^from ^subreddit](https://np.reddit.com/r/Python/about/banned) ^| [^FAQ ^/ ^Information](https://np.reddit.com/r/WikiTextBot/wiki/index) ^| [^Source](https://github.com/kittenswolf/WikiTextBot) ^]
^Downvote ^to ^remove ^| ^v0.27
dmo0qvj
Python
WikiTextBot
t1_dmo0qvj
https://www.reddit.com/r/Python/comments/6yh4vk/can_i_build_my_own_wayback_machine/dmo0qvj/
9/7/2017 2:02:57 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
dmo0qr9
t1_dmo0qr9
dmo0qr9
0
6yh4vk
False
False
False
3
1
27
27
1
0.709219858156028
0
0
0
0
90
63.8297872340426
141
128, 128, 128
3
Solid
50
No
563
Posted
6/7/2022 9:47:03 AM
**What is web scraping?**
https://preview.redd.it/uzo0lfdc66491.png?width=498&format=png&auto=webp&v=enabled&s=a2c300bec8df0670e0560a09d17869730d7079d0
Web scraping is a technique for extracting the content of websites via a script or a program, with the goal of transforming that content so it can be used in another context.
**How can web scraping be useful?**
https://preview.redd.it/e0ofo7qk66491.png?width=225&format=png&auto=webp&v=enabled&s=8ccd0792cfed043c0e7497ade552882253e0bd85
**First use case:**
A company wants to improve its customer knowledge; the best sources are customer reviews.
Take the example of the site snowleader.fr.
Snowleader is considered the online reference for purchases of snow-sports gear and equipment for outdoor and city use.
Snowleader has a catalog of 20,000 references, more than 400 brands, and more than 650,000 visitors per month on its French site.
Regarding customer reviews, their site indicates that they rely on "Trusted Shops" to collect and certify customer reviews (28,952 reviews in total). Despite the brand's effort to centralize reviews, Trusted Shops represents only part of the reviews for this site: customers also leave reviews on their own on Trustpilot (5,480 reviews with a rating of 4.2/5) and on avis-clients.fr (17 reviews). That leaves 15.9% of reviews unanalyzed, not counting social networks, and this is where web scraping becomes useful.
Web scraping can retrieve the data from these review sites and feed a database that can be analyzed in a few seconds. It is even possible to set up scheduled queries to automate the scraping.
You can find my video presenting a customer-review analysis at this link:
https://reddit.com/link/v6rbyk/video/ig5257x576491/player
**Second use case:**
Competitive intelligence.
Darty, a company founded in 1957, has over the years become the leader in home appliances, image, and sound with its famous "contrat de confiance" (trust contract).
The trust contract lets the brand's customers get the price difference refunded if they find the product cheaper elsewhere within 6 months of purchase.
The competitors eligible for the price-difference refund are:
Amazon, Apple Store, Auchan, BHV, Boulanger, BUT, Carrefour, Casino, Castorama, Cdiscount, Cobra, Conforama, Cora, Digital, Electro Dépôt, Expert, Fnac, Galeries Lafayette, Géant Casino, Gitem, Grosbill, Intermarché, La Redoute, LDLC, Leclerc, Leroy Merlin, Magma, MDA, Pixmania, Printemps, Pro & Cie, Pulsat, Rue du Commerce, U Hyper-Supermarchés, Ubaldi, webdistrib
Darty therefore wants to run competitive price monitoring for all of its in-store references.
Doing those searches by hand would be a titanic job; web scraping can look up each product reference sold at Darty and retrieve its price to feed a database. This operation takes a scraper bot only a few minutes.
**Is scraping legal?**
Web scraping is still a gray area because the subject is complex. What can be said is that as soon as data is scraped by a third party and republished without modification, it can be considered reprehensible and a violation of intellectual property. (Source: [https://islean-consulting.fr/fr/transformation-digitale/scraping-pages-web-legal/](https://islean-consulting.fr/fr/transformation-digitale/scraping-pages-web-legal/))
In our two examples, the web scraping is clearly legal.
**How do you know whether a site allows scraping?**
Take the example of DARTY: type [https://www.darty.com/](https://www.darty.com/) in the URL bar and append: robots.txt
Here is the result:
https://preview.redd.it/1unu53t266491.png?width=798&format=png&auto=webp&v=enabled&s=3f9a820605a0a1d7c95a48dea682f487d6464510
In our case, Disallow is indicated: Darty blocks scraping.
If Allow is indicated, the site allows scraping.
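This robots.txt check can also be done in code: Python's stdlib `urllib.robotparser` evaluates the Allow/Disallow rules for a given user agent and URL. A small sketch (the rules and example.com URLs below are made up for illustration, not Darty's actual file):

```python
import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
# parse() accepts the file's lines directly, so no network fetch is needed;
# with a real site you would use rp.set_url(".../robots.txt"); rp.read()
rp.parse([
    "User-agent: *",
    "Disallow: /nav/",
    "Allow: /",
])
print(rp.can_fetch("*", "https://www.example.com/nav/page"))    # False
print(rp.can_fetch("*", "https://www.example.com/produit/tv"))  # True
```

A polite crawler calls `can_fetch` before each request and skips URLs the site has disallowed.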
**How do you scrape?**
Several tools are possible.
No-code: Octoparse, ParseHub, or Webscraper.io.
These are paid but easy-to-use tools; by contrast, BeautifulSoup and Scrapy are extremely flexible and free Python libraries, but they require coding knowledge.
**To sum up:**
Scraping is very useful: it saves a lot of time and provides immense value. The use cases are endless, but certain principles must be respected in order to stay within the law.
I hope this article has been useful to you and that you have learned something about web scraping.
v6rbyk
u_yannick_data
yannick_data
t3_v6rbyk
https://www.reddit.com/r/u_yannick_data/comments/v6rbyk/tout_savoir_sur_le_webcraping/
6/7/2022 9:47:03 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Everything you need to know about web scraping
False
1
v6rbyk
0
1
1
1
2
0.278551532033426
4
0.557103064066852
0
0
438
61.0027855153203
718
128, 128, 128
3
Solid
50
No
560
Commented
3/24/2021 2:35:54 PM
You can use the snscrape library; with it you can get more than 1,000,000 tweets in a day from Twitter with Python.
Here is my Udemy course about this:
\* Get more than 1,000,000 tweets from Twitter with Python
\* Send messages to the Telegram application with Python
\* Download playlists from YouTube with Python
\* Make a Twitter bot with **Python**
https://udemy.com/course/python-with-twitter-telegram-and-youtube/?referralCode=359DD24443ECF17E6B16
by u/hak122hak in r/learnprogramming (https://www.reddit.com/r/learnprogramming/comments/m9429a/what_is_the_best_way_to_gather_a_database_of/gs1wi0i/)
Commented
3/20/2021 10:15:34 AM
If you know JavaScript you can simply AJAX that data in on page load or use Node for your backend.
Otherwise, if you want minimal programming you’ll need to find a CMS you like and see what Twitter widgets are available for it. Those should have some sort of keyword search I would think.
by u/Dondontootles in r/learnprogramming (https://www.reddit.com/r/learnprogramming/comments/m9429a/what_is_the_best_way_to_gather_a_database_of/grkrw57/)
RepliedTo
3/20/2021 2:04:54 PM
Thanks for the advice!
by u/elles_bells_ in r/learnprogramming (https://www.reddit.com/r/learnprogramming/comments/m9429a/what_is_the_best_way_to_gather_a_database_of/grl8xlx/)
Commented
6/27/2022 4:01:02 PM
View in your timezone:
[TONIGHT at 11:59 pm ET][0]
[0]: https://timee.io/20220628T0359?tl=%E2%8C%9B%20Summer%20Sale%202022%20Ends%20Today!
by u/timee_bot in r/u_Octoparseideas (https://www.reddit.com/r/u_Octoparseideas/comments/vlytzi/summer_sale_2022_ends_today/idxyssv/)
Commented
4/17/2021 5:09:17 AM
I've been web-scraping/scripting for 20 years now. When I started there weren't any pre-made tools that didn't suck, so I started writing my own.
These days there seem to be Python libraries and other stuff to help, but so far I haven't found anything that beats my in-house library (in terms of both ease of coding and power). So I'd recommend rolling your own.
That being said, before this post I hadn't even heard of "octoparse". Do people really pay money for stuff like that??? Makes me think perhaps I should sell my toolset...
by u/DogmaticAmbivalence in r/webscraping (https://www.reddit.com/r/webscraping/comments/ms7agw/build_your_own_or_use_readymade_tools/gut5x9j/)
RepliedTo
4/17/2021 6:54:35 AM
Awesome! I’m only planning to try the free plan. But I suppose people pay for it because, these days, it can be annoying to always have to update your programme to beat the new anti-scraping measures. I suppose Octoparse and its alternatives have a team of people working on that. But a university student such as myself can never afford the prices they charge :(
by u/larfleeeze in r/webscraping (https://www.reddit.com/r/webscraping/comments/ms7agw/build_your_own_or_use_readymade_tools/gutdqj2/)
RepliedTo
4/17/2021 6:19:46 AM
Yes, people pay for it. I’ve been web scraping for a couple of years as well; just wondering, what are the capabilities of your toolset?
by u/prof_happy in r/webscraping (https://www.reddit.com/r/webscraping/comments/ms7agw/build_your_own_or_use_readymade_tools/gutb9vv/)
Commented
4/17/2021 1:47:08 AM
I came across a similar situation before. What I did was use an existing tool until I found its inadequate parts. After that, I started to learn things like bs4 and Selenium, and eventually chose Scrapy. That way, I knew what I really needed for scraping and paid more attention to those parts when I learned to code. Just my two cents...
by u/McLukeJ in r/webscraping (https://www.reddit.com/r/webscraping/comments/ms7agw/build_your_own_or_use_readymade_tools/gusm7ux/)
RepliedTo
4/17/2021 3:51:26 AM
Fantastic! Your advice makes a lot of sense to me. Will heed it. Thanks!
by u/larfleeeze in r/webscraping (https://www.reddit.com/r/webscraping/comments/ms7agw/build_your_own_or_use_readymade_tools/gusz3x6/)
Commented
4/16/2021 5:02:55 PM
If it's a one-off need, use a tool.
If this is just the beginning, build your own.
by u/jcrowe in r/webscraping (https://www.reddit.com/r/webscraping/comments/ms7agw/build_your_own_or_use_readymade_tools/guqtajk/)
RepliedTo
4/17/2021 3:50:45 AM
Thanks a lot! I will consider using Octoparse since it’s free to try (until 10,000 data points scraped or smth I think). But perhaps in the long term I will build my own
by u/larfleeeze in r/webscraping (https://www.reddit.com/r/webscraping/comments/ms7agw/build_your_own_or_use_readymade_tools/gusz1ia/)
Posted
4/16/2021 4:49:24 PM
I am looking to scrape popular sites such as Booking.com, and I discovered tools such as Octoparse that already come with the infrastructure to scrape such websites.
Should I consider Octoparse (or any alternative tools), or should I build my own from scratch? I have some basic knowledge of Python, Beautiful Soup, Selenium, etc.
Let me know if any of you have had experience with tools like Octoparse. Thanks a lot :)
by u/larfleeeze, "Build your own or use ready-made tools?", posted in r/webscraping (https://www.reddit.com/r/webscraping/comments/ms7agw/build_your_own_or_use_readymade_tools/)
Commented
4/17/2021 10:16:14 AM
Thank you so much! I will look into this for sure.
by u/larfleeeze in r/webscraping (https://www.reddit.com/r/webscraping/comments/ms7agw/build_your_own_or_use_readymade_tools/guttjci/)
Commented
4/20/2021 12:57:59 PM
Hi there,
If you are looking for an option to scrape sites like booking.com, trivago, expedia or any other, tools such as Octoparse and similar are available in the market. These are really helpful, especially for those who lack the technical know-how or are not comfortable with a programming language. But the major issues with these tools remain: i) the scraper getting blocked, ii) legality.
If your requirements are basic and do not involve regular scraping, tools could be the best way to go forward. But if your requirements are customised and you need to scrape data regularly, then a web scraping service could be a better way to go. A data scraping service provider is better equipped to handle complex, customised and huge scraping requirements because of its:
* Experience in different domains
* Infrastructure
* Established workflow
* Team
If you would like to know more about the difference between a web scraping service and a tool, here is a link to help you with it.
Link: [https://www.promptcloud.com/blog/web-scraping-tool-vs-web-scraping-services/](https://www.promptcloud.com/blog/web-scraping-tool-vs-web-scraping-services/)
Hope this helps.
by u/promptcloud in r/webscraping (https://www.reddit.com/r/webscraping/comments/ms7agw/build_your_own_or_use_readymade_tools/gv7319k/)
Commented
4/19/2021 9:44:49 AM
For ready-made tools, and specifically for booking.com, I recommend the [Data Collector](https://brightdata.grsm.io/vitariz-dca). It has a working template; all you need to do is choose where and how to get the data.
by u/Gidoneli in r/webscraping (https://www.reddit.com/r/webscraping/comments/ms7agw/build_your_own_or_use_readymade_tools/gv2bcii/)
Commented
4/17/2021 10:08:15 AM
I also scrape [booking.com](https://booking.com) and I found this tool, which comes with pre-built scrapers for [booking.com](https://booking.com); in a couple of minutes I was able to get data with the free version. See the links below:
[https://webautomation.io/pde/booking-search-page-extractor/87/](https://webautomation.io/pde/booking-search-page-extractor/87/)
[https://webautomation.io/pde/booking-room-price-extractor/90/](https://webautomation.io/pde/booking-room-price-extractor/90/)
Let me know if it was useful
by u/VictorAVB in r/webscraping (https://www.reddit.com/r/webscraping/comments/ms7agw/build_your_own_or_use_readymade_tools/gutslr7/)
Posted
11/19/2019 11:25:28 AM
[removed]
by u/fOOyili, "Octoparse Task Templates: get data without coding", posted in r/content_marketing (https://www.reddit.com/r/content_marketing/comments/dyj6qz/octoparse_task_templates_get_data_without_coding/)
Commented
11/19/2019 11:25:28 AM
Your submission has been automatically removed because you are new to our community, spend more time contributing with your comments before posting.
*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/content_marketing) if you have any questions or concerns.*
by u/AutoModerator in r/content_marketing (https://www.reddit.com/r/content_marketing/comments/dyj6qz/octoparse_task_templates_get_data_without_coding/f81d8ph/)
Posted
12/7/2022 10:11:20 PM
I have an Octoparse task I'm working on and the website is a bit frustrating. I'd love some input on how you might get around this.
I need to get product details that are displayed on a modal window. That modal takes between 3-15 seconds to display. If Octoparse tries to scrape data before the data (not the modal) fully loads, the results are placeholders, like `{item_number}`.
I've tried having it wait for an element to be visible, but the problem is that all elements are technically visible (though not to the human eye) - they just have placeholders. So, it scrapes the placeholders.
Worst case scenario, I can set the wait time to extract to 30s, that seems to capture them all, but it's going to take forever. There are about 20k records I need to gather.
The website is behind a client's login, so I can't share it. Hopefully I've explained it clearly.
Element before the data loads:
`<span class="dynamic_field_item_id">{item_number}</span>`
Element after the data loads:
`<span class="dynamic_field_item_id">52726915</span>`
I will appreciate all of your ideas. :)
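One code-level way to express "wait for real data, not the placeholder" is a small predicate on the element text. The regex below is an assumption about what counts as a placeholder (it matches the `{item_number}` shape from the post, not Octoparse's internal logic); in a scripted scraper (e.g. Selenium) such a predicate could drive an explicit wait instead of a fixed 30 s delay:

```python
import re

# Heuristic (an assumption, not Octoparse's own logic): a field still shows
# an unrendered template placeholder if its whole text looks like "{snake_case}".
PLACEHOLDER = re.compile(r"^\{[A-Za-z0-9_]+\}$")

def is_loaded(text: str) -> bool:
    """Return True once the span holds real data instead of a placeholder."""
    return PLACEHOLDER.match(text.strip()) is None

print(is_loaded("{item_number}"))  # False
print(is_loaded("52726915"))       # True
```

With Selenium, for example, this could be wrapped in a `WebDriverWait(...).until(...)` condition that re-reads the span's text until `is_loaded` returns True, polling instead of sleeping for a worst-case interval.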
by u/IAmEarthGirl, "Avoiding Placeholders in Dynamic Data", posted in r/webscraping (https://www.reddit.com/r/webscraping/comments/zffqif/avoiding_placeholders_in_dynamic_data/)
Commented
12/9/2022 2:39:18 AM
I figured this out: I just compared the two pages until I finally found a div that wasn't displayed before the data populated.
by u/IAmEarthGirl in r/webscraping (https://www.reddit.com/r/webscraping/comments/zffqif/avoiding_placeholders_in_dynamic_data/izhbgek/)
Posted
8/13/2021 2:30:50 AM
Hi everyone!
This is my [target page](https://produto.mercadolivre.com.br/MLB-1908028290-fidget-toys-hand-spinner-anti-stress-pop-it-bolha-colorido-_JM?attributes=COLOR_SECONDARY_COLOR%3ARGlub3NzYXVybyBDb2xvcmlkbw%3D%3D&quantity=1).
I am trying to extract all images (sometimes there are more than can be displayed, so the bottom-most image is replaced with a +X button, where X is the number of additional photos). The thing is that **you can click on "Cor" on the right side and choose a variation, which completely changes all of the photos**.
First of all, how can I extract all of the photos for a given product variation, including the hidden ones? Using the normal "list" functionality doesn't seem to work and I am only able to extract the ones that are visible.
Secondly, how can I flip through each variation so that I can extract their respective images as well?
Thanks a lot!
by u/JoZeHgS, "Could I have some help with Octoparse?", posted in r/webscraping (https://www.reddit.com/r/webscraping/comments/p3eiei/could_i_have_some_help_with_octoparse/)
Posted
8/16/2021 2:51:54 AM
Hi everyone!
I would like to automate the "search by image" process on [aliseeks.com](https://aliseeks.com). I have thousands of images that I would like to search for there and all I am interested in is getting the URL of the page that follows the successful upload so that I can use Octoparse to scrape info from it.
I know Python but have no experience using it for web programming. What is the best way to accomplish this? Is there ready-made software that I could use, or would I have to implement something myself?
Thanks a lot!
by u/JoZeHgS, "How can I achieve this?", posted in r/webscraping (https://www.reddit.com/r/webscraping/comments/p57hx4/how_can_i_achieve_this/)
Posted
8/21/2021 11:10:18 AM
Hi everyone
I have scraped with Octoparse a few times and I just started learning Python scraping and I have a question. Just how much faster is scraping with Python vs Octoparse? Is Python the fastest way of all?
Thanks a lot!
by u/JoZeHgS, "How does pure Python code compare to Octoparse in terms of speed?", posted in r/webscraping (https://www.reddit.com/r/webscraping/comments/p8plww/how_does_pure_python_code_compare_to_octoparse_in/)
Posted
8/21/2021 1:34:35 AM
[removed]
by u/JoZeHgS, "How much faster is web scraping with Python vs with Octoparse?", posted in r/webdev (https://www.reddit.com/r/webdev/comments/p8ifqy/how_much_faster_is_web_scraping_with_python_vs/)
Posted
8/20/2021 11:25:40 AM
Hi everyone!
How much faster is scraping with Python vs Octoparse? Is Python the fastest way of all?
Thanks a lot!
by u/JoZeHgS, "How fast is Octoparse compared to web scraping with Python?", posted in r/webscraping (https://www.reddit.com/r/webscraping/comments/p83a2s/how_fast_is_octoparse_compared_to_web_scraping/)
Commented
8/21/2021 5:58:10 PM
The bottleneck is not the programming language. It's the network latency and data transfer.
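A quick way to see why latency dominates: simulate ten fetches that each take ~0.2 s of "network" wait and issue them concurrently with asyncio. The URLs and the delay are invented for the demo; no real requests are made.

```python
import asyncio
import time

# Simulated fetch: ~0.2 s of "network latency" per page (invented delay).
async def fetch(url: str) -> str:
    await asyncio.sleep(0.2)
    return f"<html>{url}</html>"

async def crawl(urls):
    # The wait is network-bound, so issuing all requests concurrently makes
    # the total time roughly max(latency), not sum(latency).
    return await asyncio.gather(*(fetch(u) for u in urls))

urls = [f"https://example.com/page/{i}" for i in range(10)]
start = time.perf_counter()
pages = asyncio.run(crawl(urls))
elapsed = time.perf_counter() - start
print(f"{len(pages)} pages in {elapsed:.2f}s")  # ~0.2s total, not ~2s
```

This is why a fast language buys little here: whether the scraper is Octoparse or hand-written Python, wall-clock time is governed by how many requests are in flight at once.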
by u/RobSm in r/webscraping (https://www.reddit.com/r/webscraping/comments/p8plww/how_does_pure_python_code_compare_to_octoparse_in/h9tauqv/)
RepliedTo
8/21/2021 6:43:37 PM
I see! Ok, thank you very much
by u/JoZeHgS in r/webscraping (https://www.reddit.com/r/webscraping/comments/p8plww/how_does_pure_python_code_compare_to_octoparse_in/h9tguws/)
RepliedTo
6/4/2017 6:52:39 PM
I tried to turn your GitHub links into [permanent links](https://help.github.com/articles/getting-permanent-links-to-files/) ([press **"y"**](https://help.github.com/articles/getting-permanent-links-to-files/#press-y-to-permalink-to-a-file-in-a-specific-commit) to do this yourself):
* [youtube/api-samples/.../**python** (master → 875d380)](https://github.com/youtube/api-samples/tree/875d380396ba5771322f7d1bf678bbf42b63ecc6/python)
----
^(Shoot me a PM if you think I'm doing something wrong.)^( To delete this, click) [^here](https://www.reddit.com/message/compose/?to=GitHubPermalinkBot&subject=deletion&message=Delete reply digc8qk.)^.
by u/GitHubPermalinkBot in r/Python (https://www.reddit.com/r/Python/comments/6f3y6s/web_scrape_youtube/digc8qk/)
Commented
11/24/2021 10:33:27 PM
Octoparse is not web based (you have to install desktop software) and thus completely unusable for the use case described.
by u/bartoncls in r/u_Octoparseideas (https://www.reddit.com/r/u_Octoparseideas/comments/pps9yr/how_to_develop_and_grow_your_niche_job_board/hlyka14/)
RepliedTo
11/25/2021 2:03:15 AM
Hi, users of Octoparse can create their own scrapers using the software, and the use case is available.
by u/Octoparseideas in r/u_Octoparseideas (https://www.reddit.com/r/u_Octoparseideas/comments/pps9yr/how_to_develop_and_grow_your_niche_job_board/hlzcl4p/)
RepliedTo
11/25/2021 2:06:48 AM
I wouldn't call this "automated" in the sense that things run in the cloud. It still requires manual fiddling with desktop software.
by u/bartoncls in r/u_Octoparseideas (https://www.reddit.com/r/u_Octoparseideas/comments/pps9yr/how_to_develop_and_grow_your_niche_job_board/hlzd21n/)
RepliedTo
11/25/2021 6:55:56 AM
Well, after the task is scheduled to run in the cloud, the scraper works on its own, and users can then extract the data they need.
by u/Octoparseideas in r/u_Octoparseideas (https://www.reddit.com/r/u_Octoparseideas/comments/pps9yr/how_to_develop_and_grow_your_niche_job_board/hm0a67f/)
RepliedTo
11/25/2021 9:12:15 PM
So confusing, any video demonstrating all this?
by u/bartoncls in r/u_Octoparseideas (https://www.reddit.com/r/u_Octoparseideas/comments/pps9yr/how_to_develop_and_grow_your_niche_job_board/hm2vdg9/)
RepliedTo
11/26/2021 1:32:32 AM
https://youtu.be/xBBAD407zzU
hm3qhba
u_Octoparseideas
Octoparseideas
t1_hm3qhba
https://www.reddit.com/r/u_Octoparseideas/comments/pps9yr/how_to_develop_and_grow_your_niche_job_board/hm3qhba/
11/26/2021 1:32:32 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hm2vdg9
t1_hm2vdg9
hm2vdg9
0
pps9yr
True
False
False
5
9
3
3
0
0
0
0
0
0
0
0
0
128, 128, 128
3
Solid
50
No
519
Posted
10/25/2020 8:39:29 PM
If I wanted the URLs of the full-size images uploaded to each gallery, which would be the better option? Basically, you have a results page with roughly 20 thumbnails for different galleries; clicking one opens a new page with 2-25 thumbnail-size images from that gallery, and each of those needs to be clicked for the full resolution. If I wanted to save all images from each gallery across a select number of pages, or all pages, of my search results, which would be the better option and more beginner friendly?
Thanks
ji0p9c
webscraping
Chuck_You
t3_ji0p9c
https://www.reddit.com/r/webscraping/comments/ji0p9c/parsehub_vs_octoparse/
10/25/2020 8:39:29 PM
10/25/2020 8:44:07 PM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Parsehub vs Octoparse
False
1
ji0p9c
0
1
1
1
3
3.15789473684211
0
0
0
0
40
42.1052631578947
95
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
518
Posted
6/26/2021 5:40:39 AM
https://www.octoparse.com/
o84rk4
programmingtools
AsimRazaJalbani
t3_o84rk4
https://www.reddit.com/r/programmingtools/comments/o84rk4/web_scraping_tool_free_web_crawlers_octoparse/
6/26/2021 5:40:39 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Web Scraping Tool & Free Web Crawlers - Octoparse makes the web scraping process easy, with no coding language needed to operate this amazing tool.
False
1
o84rk4
0
4
1
1
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
517
Posted
6/26/2021 5:40:39 AM
https://www.octoparse.com/
o84rk4
programmingtools
AsimRazaJalbani
t3_o84rk4
https://www.reddit.com/r/programmingtools/comments/o84rk4/web_scraping_tool_free_web_crawlers_octoparse/
6/26/2021 5:40:39 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Web Scraping Tool & Free Web Crawlers - Octoparse makes the web scraping process easy, with no coding language needed to operate this amazing tool.
False
1
o84rk4
0
4
1
1
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
516
Posted
10/20/2021 6:07:19 PM
https://i.redd.it/dvd9mve69nu71.jpg
qc70r4
poland
AnnaHrytsyk
t3_qc70r4
https://www.reddit.com/r/poland/comments/qc70r4/welovenocode_is_looking_for_nocode_developers_and/
10/20/2021 6:07:19 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
WELOVENOCODE is looking for nocode developers and specialists of ZAPIER/ CALCAPP/ COVERKIT/ DROPSOURCE/ FYLAMYNT/ KLAVIYO/ MURKSTOM/ OCTOPARSE/ PAYHIP/ PAZLY/ SHEETSU/ WEGLOT/ AMPTSTOR/ POWER IMPORTER/ UBOT STUDIO to join our team ASAP. If you are interested, contact me through Telegram:@AnnaG007
False
0.5
qc70r4
0
4
1
1
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
515
Posted
10/20/2021 6:07:19 PM
https://i.redd.it/dvd9mve69nu71.jpg
qc70r4
poland
AnnaHrytsyk
t3_qc70r4
https://www.reddit.com/r/poland/comments/qc70r4/welovenocode_is_looking_for_nocode_developers_and/
10/20/2021 6:07:19 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
WELOVENOCODE is looking for nocode developers and specialists of ZAPIER/ CALCAPP/ COVERKIT/ DROPSOURCE/ FYLAMYNT/ KLAVIYO/ MURKSTOM/ OCTOPARSE/ PAYHIP/ PAZLY/ SHEETSU/ WEGLOT/ AMPTSTOR/ POWER IMPORTER/ UBOT STUDIO to join our team ASAP. If you are interested, contact me through Telegram:@AnnaG007
False
0.5
qc70r4
0
4
1
1
128, 128, 128
3
Solid
50
No
514
Posted
8/5/2022 11:45:46 AM
If you're preparing to land your first job as a data analyst, you've likely encountered an age-old conundrum: how can a novice data analyst acquire experience while applying for their first data analysis position?
Here, your portfolio comes into play. The projects you include in your portfolio illustrate your skills and experience to hiring managers and interviewers, even if they are not from a previous data analytics position. Even if you lack past work experience, populating your portfolio with the right projects can go a long way toward establishing confidence that you are the ideal candidate for the position.
In this post, we will explore five sorts of data analytics projects that you should include in your portfolio, particularly if you are just starting out. You will find examples of how these projects are presented in actual portfolios, as well as a list of public data sets you can use to get your projects started.
**Ideas for data analysis projects**
As a prospective data analyst, your portfolio should highlight a few essential abilities. These project ideas for data analytics illustrate the duties that are generally basic to the majority of data analyst employment.
**Web scraping**
Although there is no shortage of fantastic (and free) public data sets on the internet, you may want to demonstrate to potential employers that you can also locate and scrape your own data. In addition, learning how to scrape web data enables you to locate and utilise data sets that correspond to your interests, regardless of whether they have already been assembled.
If you are familiar with Python, you can use programmes such as Beautiful Soup or Scrapy to crawl the web in search of useful data. Don't worry if you don't know how to code. You will also find various tools that automate the process, such as Octoparse and ParseHub (many of which provide a free trial).
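As a quick illustration of the Beautiful Soup approach mentioned above, here is a minimal sketch. The page markup is invented for the example; in practice you would fetch the HTML with `requests.get(url).text` first.

```python
# Minimal Beautiful Soup sketch; the job-listing HTML below is a made-up
# stand-in for a fetched page.
from bs4 import BeautifulSoup

html = """
<ul class="jobs">
  <li><a href="/jobs/1">Data Analyst</a></li>
  <li><a href="/jobs/2">BI Developer</a></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")
# Collect (title, link) pairs from every anchor inside the job list.
jobs = [(a.get_text(), a["href"]) for a in soup.select("ul.jobs a")]
print(jobs)  # [('Data Analyst', '/jobs/1'), ('BI Developer', '/jobs/2')]
```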
**Data maintenance**
As a data analyst, you are responsible for cleaning data to prepare it for analysis. Data cleaning (also known as data scrubbing) is the process of deleting inaccurate and duplicate data, controlling data gaps, and ensuring uniform data formatting.
**Exploratory data analysis (EDA)**
The essence of data analysis is answering questions with data. EDA, or exploratory data analysis, helps you determine what questions to ask. This could be done independently or in tandem with data cleansing. Regardless, you should complete the following during these preliminary inquiries.
1. Ask numerous questions regarding the data.
2. Determine the underlying data structure.
3. Examine the data for trends, patterns, and abnormalities.
4. Formulate and validate hypotheses based on the data.
5. Consider what problems you might be able to solve with the data.
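The steps above can be sketched with pandas on a toy data set (the sales figures here are invented purely for illustration):

```python
# Toy EDA pass with pandas over a tiny, invented sales table.
import pandas as pd

df = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "sales":  [100, 250, 120, None],
})

# 1-2. Ask questions about the data and inspect its underlying structure.
print(df.dtypes)

# 3. Examine for trends, patterns, and abnormalities (here: a missing value).
missing = int(df["sales"].isna().sum())
print(missing)  # 1

# 4-5. A first hypothesis to validate: the south region outsells the north.
by_region = df.groupby("region")["sales"].mean()
print(by_region["south"] > by_region["north"])  # True
```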
**Sentiment analysis**
Typically performed on textual data, sentiment analysis is a natural language processing (NLP) technique that determines if data is neutral, positive, or negative. It can also be used to identify a specific emotion based on a list of words and their associated emotions (known as a lexicon).
This type of analysis functions well on public review sites and social media platforms, where individuals are likely to express their thoughts on a variety of topics.
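A lexicon-based scorer of the kind described above can be sketched in a few lines. The word list here is a tiny made-up example, not a real lexicon such as VADER:

```python
# Tiny lexicon-based sentiment scorer; the lexicon is illustrative only.
LEXICON = {"great": 1, "love": 1, "good": 1, "bad": -1, "awful": -1}

def sentiment(text: str) -> str:
    # Sum the lexicon scores of each word, ignoring punctuation and case.
    score = sum(LEXICON.get(w.strip(".,!?").lower(), 0) for w in text.split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this product, it is great!"))  # positive
print(sentiment("Awful support."))                     # negative
```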
**Data visualisation**
Humans are visual creatures. Thus, data visualisation is a potent tool for translating facts into a captivating narrative to inspire action. Not only are great visualisations enjoyable to produce, but they also have the ability to make your portfolio look stunning.
The Required Data Analyst Skills are highly comprehensive and cover a vast array of Data Management Process domains. If you can check off every item on the Data Analyst Skills Checklist, you have a decent chance of breaking into the profession of Data Analytics, which is one of the most in-demand in the computing business.
Given the job outlook for Data Analysts, it could be advisable to pursue one. As a Data Analyst specialist at Syntax Technologies, you have a fantastic chance to advance your abilities. We help you develop Data Analyst Skills consistent with industry standards and technological expectations. Enroll now in our [Data Analytics course.](https://www.syntaxtechs.com/courses/data-analytics-and-business-intelligence-training-course-online)
wgtcy8
u_syntaxtechnologies17
syntaxtechnologies17
t3_wgtcy8
https://www.reddit.com/r/u_syntaxtechnologies17/comments/wgtcy8/data_analytics_projects_for_beginners/
8/5/2022 11:45:46 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Data Analytics Projects for Beginners
False
1
wgtcy8
0
1
1
1
24
3.49344978165939
7
1.01892285298399
0
0
346
50.3639010189229
687
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
513
Posted
6/21/2022 4:01:49 PM
Hi,
I want to use the extension API connector to pull data automatically from Octoparse.
Are there any tutorials on how to do it coz I'm getting super confused?
vhgou1
googlesheets
Real_Strategy3314
t3_vhgou1
https://www.reddit.com/r/googlesheets/comments/vhgou1/octoparse_using_api_connector_extension/
6/21/2022 4:01:49 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse using API connector extension
False
1
vhgou1
0
4
9
9
1
3.44827586206897
1
3.44827586206897
0
0
12
41.3793103448276
29
128, 128, 128
3
Solid
50
No
512
Commented
6/21/2022 4:01:51 PM
Posting your data can make it easier for others to help you, but it looks like your submission doesn't include any. If this is the case and data would help, you can read how to include it in the [submission guide](https://www.reddit.com/r/googlesheets/wiki/postguide#wiki_posting_your_data). You can also use this tool created by a Reddit community member to [create a blank Google Sheets document](https://docs.google.com/forms/d/e/1FAIpQLSeprZS3Al0n7JiVQIEiCi_Ad9FRXbpgB7x1-Wq6iAfdmVbWiA/viewform) that isn't connected to your account. Thank you.
*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/googlesheets) if you have any questions or concerns.*
id71w10
googlesheets
AutoModerator
t1_id71w10
https://www.reddit.com/r/googlesheets/comments/vhgou1/octoparse_using_api_connector_extension/id71w10/
6/21/2022 4:01:51 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
vhgou1
t3_vhgou1
vhgou1
0
vhgou1
False
False
False
0
1
9
9
2
1.72413793103448
1
0.862068965517241
0
0
51
43.9655172413793
116
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
511
Posted
6/21/2022 4:02:14 PM
Hi, I want to use the extension API connector to pull data automatically from Octoparse. Are there any tutorials on how to do it coz I'm getting super confused?
vhgp60
webscraping
Real_Strategy3314
t3_vhgp60
https://www.reddit.com/r/webscraping/comments/vhgp60/octoparse_using_api_connector_extension/
6/21/2022 4:02:14 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse using API connector extension
False
1
vhgp60
0
4
9
9
1
3.44827586206897
1
3.44827586206897
0
0
12
41.3793103448276
29
128, 128, 128
3
Solid
50
No
510
Commented
6/21/2022 6:06:38 PM
Depends on the [API](https://www.octoparse.com/tutorial/advanced-api#) from Octoparse. You will find the authorization, endpoints and methods there.
id7ivaf
googlesheets
RemcoE33
t1_id7ivaf
https://www.reddit.com/r/googlesheets/comments/vhgou1/octoparse_using_api_connector_extension/id7ivaf/
6/21/2022 6:06:38 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
vhgou1
t3_vhgou1
vhgou1
0
vhgou1
False
False
False
0
1
9
9
1
4.54545454545455
0
0
0
0
10
45.4545454545455
22
128, 128, 128
3
Solid
50
Yes
508
Commented
4/3/2023 4:14:25 PM
Each website has its own structure, so you'll need to add custom code for each website.
And JSON isn't a language; it's just a file type where you store data in a specific structure.
If you need someone to do the job for you, you can PM me with a website or two and I'll take a look.
jesv7cg
webscraping
trafalgarDxlaw
t1_jesv7cg
https://www.reddit.com/r/webscraping/comments/12aoca8/scraping_off_promos_on_shopping_sites/jesv7cg/
4/3/2023 4:14:25 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
12aoca8
t3_12aoca8
12aoca8
1
12aoca8
False
False
False
0
1
4
4
1
1.85185185185185
0
0
0
0
24
44.4444444444444
54
128, 128, 128
3
Solid
50
Yes
507
RepliedTo
4/3/2023 7:37:08 PM
Awesome, websites coming your way!
jetq5kh
webscraping
allchoppedup
t1_jetq5kh
https://www.reddit.com/r/webscraping/comments/12aoca8/scraping_off_promos_on_shopping_sites/jetq5kh/
4/3/2023 7:37:08 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jesv7cg
t1_jesv7cg
jesv7cg
0
12aoca8
True
False
False
1
1
4
4
1
20
0
0
0
0
3
60
5
128, 128, 128
3
Solid
50
No
509
Posted
4/3/2023 3:45:39 PM
Non-coder here. I’ve been trying to use Octoparse to scrape sales from online retailers (think “30% off today only at Banana Republic”). Mostly from their sales page (if there is one) or small banners at the top of the site and exporting the selected text to Google sheets with Zapier.
Obviously there are issues like the websites change where they put the sales text so the pulls aren’t always accurate.
What’s my best option here long term, hire a professional scraper to do via json or python?
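One way to cope with sites moving their promo text around, sketched below under the assumption that the page is fetched elsewhere: instead of pinning a CSS selector, scan all of the page's text for "% off"-style phrases. The banner HTML is a made-up stand-in for a retailer page.

```python
# Selector-free promo extraction: pull all text, then regex for "% off".
import re
from bs4 import BeautifulSoup

html = '<div class="hero-banner">30% off today only at Banana Republic</div>'

text = BeautifulSoup(html, "html.parser").get_text(" ", strip=True)
promos = re.findall(r"\d+% off[^.<]*", text, flags=re.IGNORECASE)
print(promos)  # ['30% off today only at Banana Republic']
```

This survives layout changes as long as the promo wording itself stays in the page text.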
12aoca8
webscraping
allchoppedup
t3_12aoca8
https://www.reddit.com/r/webscraping/comments/12aoca8/scraping_off_promos_on_shopping_sites/
4/3/2023 3:45:39 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scraping % off promos on shopping sites
False
1
12aoca8
0
1
4
4
3
3.2967032967033
1
1.0989010989011
0
0
44
48.3516483516484
91
128, 128, 128
3
Solid
50
No
506
Commented
10/14/2022 12:04:51 PM
Something like node-red could also work.
isa1bhb
selfhosted
Alx_xl
t1_isa1bhb
https://www.reddit.com/r/selfhosted/comments/y22spz/self_hosted_webscraper/isa1bhb/
10/14/2022 12:04:51 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
y22spz
t3_y22spz
y22spz
0
y22spz
False
False
False
0
1
8
8
1
14.2857142857143
0
0
0
0
3
42.8571428571429
7
128, 128, 128
3
Solid
50
No
505
Commented
10/13/2022 1:02:17 PM
I simply use Wget with Cronjobs
is5ekyf
selfhosted
Alfagun74
t1_is5ekyf
https://www.reddit.com/r/selfhosted/comments/y22spz/self_hosted_webscraper/is5ekyf/
10/13/2022 1:02:17 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
y22spz
t3_y22spz
y22spz
0
y22spz
False
False
False
0
1
8
8
0
0
0
0
0
0
4
66.6666666666667
6
128, 128, 128
3
Solid
50
No
504
Commented
10/12/2022 7:51:43 PM
I think you can do this with Huginn.
is29sc1
selfhosted
ok-until-you-arrived
t1_is29sc1
https://www.reddit.com/r/selfhosted/comments/y22spz/self_hosted_webscraper/is29sc1/
10/12/2022 7:51:43 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
y22spz
t3_y22spz
y22spz
0
y22spz
False
False
False
0
1
8
8
0
0
0
0
0
0
2
25
8
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
756
Posted
1/3/2023 7:07:18 PM
Does anyone know something similar to Octoparse but free and self-hosted? I tried changedetection but it is not the same.
Thank you very much for the help.
102gfg4
selfhosted
Sinclairxer
t3_102gfg4
https://www.reddit.com/r/selfhosted/comments/102gfg4/selfhosted_web_scraper/
1/3/2023 7:07:18 PM
1/1/0001 12:00:00 AM
False
False
8
2
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Self-hosted web scraper?
False
0.8
102gfg4
0
4
8
8
2
7.69230769230769
0
0
0
0
12
46.1538461538462
26
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
755
Posted
10/12/2022 12:36:19 PM
Hello.
Does anyone know a self-hosted web scraper?
Something similar to Octoparse.
Thank you.
y22spz
selfhosted
Sinclairxer
t3_y22spz
https://www.reddit.com/r/selfhosted/comments/y22spz/self_hosted_webscraper/
10/12/2022 12:36:19 PM
1/1/0001 12:00:00 AM
False
False
3
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Self hosted web-scraper?
False
0.72
y22spz
0
4
8
8
1
7.14285714285714
0
0
0
0
10
71.4285714285714
14
128, 128, 128
3
Solid
50
No
503
Commented
10/12/2022 6:55:03 PM
Yacy offers this https://github.com/yacy/yacy_search_server
is20yk3
selfhosted
zumtest99
t1_is20yk3
https://www.reddit.com/r/selfhosted/comments/y22spz/self_hosted_webscraper/is20yk3/
10/12/2022 6:55:03 PM
1/1/0001 12:00:00 AM
False
False
4
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
y22spz
t3_y22spz
y22spz
0
y22spz
False
False
False
0
1
8
8
0
0
0
0
0
0
2
66.6666666666667
3
128, 128, 128
3
Solid
50
No
501
Commented
5/11/2019 2:35:52 PM
I’m using rvest and RSelenium. Been considering moving to Beautiful Soup? Idk, I worry it’s the sunk cost fallacy and I’m only using R because I invested the time to learn to scrape in R.
en4eiyf
scrapinghub
RollinDeepWithData
t1_en4eiyf
https://www.reddit.com/r/scrapinghub/comments/a4p6bh/scraping_software/en4eiyf/
5/11/2019 2:35:52 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
a4p6bh
t3_a4p6bh
a4p6bh
0
a4p6bh
False
False
False
0
1
14
14
1
2.7027027027027
3
8.10810810810811
0
0
13
35.1351351351351
37
128, 128, 128
3
Solid
50
No
500
Commented
1/1/2019 7:54:37 AM
I do everything in python:
* requests for single pages
* scrapy for recursive crawling or big projects
* selenium for dynamic websites
* lxml/bs4 for parsing
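For comparison with the lxml/bs4 step above, even the standard library can handle basic parsing. A minimal link extractor built on `html.parser` (bs4 or lxml are nicer for real work):

```python
# Stdlib-only link extractor using html.parser.
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href")

parser = LinkCollector()
parser.feed('<p><a href="/a">one</a> <a href="/b">two</a></p>')
print(parser.links)  # ['/a', '/b']
```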
ed01ooo
scrapinghub
rugantio
t1_ed01ooo
https://www.reddit.com/r/scrapinghub/comments/a4p6bh/scraping_software/ed01ooo/
1/1/2019 7:54:37 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
a4p6bh
t3_a4p6bh
a4p6bh
0
a4p6bh
False
False
False
0
1
14
14
1
4.16666666666667
0
0
0
0
14
58.3333333333333
24
128, 128, 128
3
Solid
50
No
499
Commented
12/10/2018 6:38:40 PM
For small projects just selenium/webdriver.
Also scrapy for larger projects.
ebielfz
scrapinghub
TriggazTilt
t1_ebielfz
https://www.reddit.com/r/scrapinghub/comments/a4p6bh/scraping_software/ebielfz/
12/10/2018 6:38:40 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
a4p6bh
t3_a4p6bh
a4p6bh
0
a4p6bh
False
False
False
0
1
14
14
0
0
0
0
0
0
7
63.6363636363636
11
128, 128, 128
3
Solid
50
No
498
Commented
12/10/2018 12:35:38 PM
Have you used (and compared) Octoparse with other services? Curious to understand how it became your tool of choice for scraping.
ebhqca9
scrapinghub
pablohoffman
t1_ebhqca9
https://www.reddit.com/r/scrapinghub/comments/a4p6bh/scraping_software/ebhqca9/
12/10/2018 12:35:38 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
a4p6bh
t3_a4p6bh
a4p6bh
0
a4p6bh
False
False
False
0
1
14
14
0
0
0
0
0
0
10
47.6190476190476
21
128, 128, 128
3
Solid
50
No
502
Posted
12/9/2018 9:51:05 PM
Hey fam jam!
Just out of curiosity, what is everyone using to scrape web data?
I am currently using Octoparse.
The reason I ask is because I would love to connect with more people who are using this scraping service to learn from others.
a4p6bh
scrapinghub
Ankerstein17
t3_a4p6bh
https://www.reddit.com/r/scrapinghub/comments/a4p6bh/scraping_software/
12/9/2018 9:51:05 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scraping Software
False
1
a4p6bh
0
1
14
14
1
2
1
2
0
0
24
48
50
128, 128, 128
3
Solid
50
No
497
Commented
12/10/2018 3:58:13 AM
I am starting to use it. How do you like it so far?
ebh5uec
scrapinghub
joyisbrightcolors
t1_ebh5uec
https://www.reddit.com/r/scrapinghub/comments/a4p6bh/scraping_software/ebh5uec/
12/10/2018 3:58:13 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
a4p6bh
t3_a4p6bh
a4p6bh
0
a4p6bh
False
False
False
0
1
14
14
0
0
0
0
0
0
3
23.0769230769231
13
128, 128, 128
3
Solid
50
Yes
494
Commented
2/18/2021 6:07:45 PM
Data like this?
* [Tournament Listing](https://gist.github.com/thegrif/964c3069e350ce427954a90d19a0f2fb)
* [Tournament Details](https://gist.github.com/thegrif/77b5b01baa56f0ddb2d4eece0b80d1e2)
gnwtke6
webscraping
thegrif
t1_gnwtke6
https://www.reddit.com/r/webscraping/comments/lmpgsc/im_looking_for_effective_web_scraping_methods_and/gnwtke6/
2/18/2021 6:07:45 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lmpgsc
t3_lmpgsc
lmpgsc
1
lmpgsc
False
False
False
0
1
28
28
0
0
0
0
0
0
13
68.4210526315789
19
128, 128, 128
3
Solid
50
Yes
493
RepliedTo
2/18/2021 6:25:22 PM
Yes! That's what I need to extract, how did you do it?
gnww88g
webscraping
juanchi_parra
t1_gnww88g
https://www.reddit.com/r/webscraping/comments/lmpgsc/im_looking_for_effective_web_scraping_methods_and/gnww88g/
2/18/2021 6:25:22 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gnwtke6
t1_gnwtke6
gnwtke6
0
lmpgsc
True
False
False
1
1
28
28
0
0
0
0
0
0
3
25
12
128, 128, 128
3
Solid
50
No
492
Commented
2/18/2021 4:24:15 PM
You could try the webscraper plugin (webscraper.io)
gnwefo6
webscraping
warrior_321
t1_gnwefo6
https://www.reddit.com/r/webscraping/comments/lmpgsc/im_looking_for_effective_web_scraping_methods_and/gnwefo6/
2/18/2021 4:24:15 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lmpgsc
t3_lmpgsc
lmpgsc
0
lmpgsc
False
False
False
0
1
28
28
0
0
0
0
0
0
5
62.5
8
128, 128, 128
3
Solid
50
No
491
Commented
10/25/2022 10:04:16 PM
So if I want to get the address and phone number of convenience stores from the company's website, I can scrape the data as long as the ToS doesn't mention anything against web scraping. What do I look for in the robots.txt file to let me know if I can scrape data from the website?
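In robots.txt you look for `Disallow` rules under the `User-agent` that applies to you; Python's standard library can check a path directly. The robots.txt contents below are a made-up example:

```python
# Check whether specific paths are allowed by a robots.txt file.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /admin/
Allow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("*", "https://example.com/stores/locations"))  # True
print(rp.can_fetch("*", "https://example.com/admin/users"))       # False
```

In practice you would load the live file with `rp.set_url(".../robots.txt"); rp.read()` instead of parsing a string.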
its46qs
Octoparse_ideas
No-Crew-4297
t1_its46qs
https://www.reddit.com/r/Octoparse_ideas/comments/w1u5vx/is_web_scraping_legal_and_why/its46qs/
10/25/2022 10:04:16 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
w1u5vx
t3_w1u5vx
w1u5vx
0
w1u5vx
False
False
False
0
1
3
3
2
3.50877192982456
0
0
0
0
23
40.3508771929825
57
128, 128, 128
3
Solid
50
No
489
Commented
2/4/2023 8:42:02 AM
I learned Python and I'm originally from marketing. So try it for the long term.
j75wqhp
webscraping
Old_Flounder_8640
t1_j75wqhp
https://www.reddit.com/r/webscraping/comments/10sqtgh/question_from_a_noncoder/j75wqhp/
2/4/2023 8:42:02 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
10sqtgh
t3_10sqtgh
10sqtgh
0
10sqtgh
False
False
False
0
1
29
29
0
0
0
0
0
0
7
46.6666666666667
15
128, 128, 128
3
Solid
50
No
488
Commented
2/3/2023 8:29:57 PM
It all depends on how the site is made. Some sites really don't want you to scrape them, and so they put a lot of effort into making it difficult to do. Others don't care. Many offer public APIs so that you can interact with them programmatically in an easy, controlled and official way.
j73j3rq
webscraping
ajt9000
t1_j73j3rq
https://www.reddit.com/r/webscraping/comments/10sqtgh/question_from_a_noncoder/j73j3rq/
2/3/2023 8:29:57 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
10sqtgh
t3_10sqtgh
10sqtgh
0
10sqtgh
False
False
False
0
1
29
29
1
1.85185185185185
1
1.85185185185185
0
0
21
38.8888888888889
54
128, 128, 128
3
Solid
50
No
490
Posted
2/3/2023 6:04:17 PM
Hello,
I wonder if anyone can enlighten me on web scraping.
As a non-coder in a sales and marketing role, I am considering a couple of scraping platforms for the purpose of grabbing names, addresses and phone numbers + maybe even emails from websites as sales leads.
I’ve looked at Parsehub and Octoparse and they look like they would be easy enough.
However, after reading through the posts on this sub, and becoming aware of how many obstacles you guys come across every day, I’m beginning to think that these platforms are overly optimistic about what can be accomplished if you can’t write code.
Anyone care to comment? There’s no money in the budget to hire the work out. I was hoping to solve this challenge with a cheap subscription.
10sqtgh
webscraping
Strokesite
t3_10sqtgh
https://www.reddit.com/r/webscraping/comments/10sqtgh/question_from_a_noncoder/
2/3/2023 6:04:17 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Question from a non-coder
False
1
10sqtgh
0
1
29
29
8
5.97014925373134
1
0.746268656716418
0
0
53
39.5522388059701
134
128, 128, 128
3
Solid
50
No
487
Commented
2/3/2023 6:42:17 PM
I was recently exposed to a scraping chrome extension called Data Miner. Its free version is pretty limited, but if all you want is to collect names, emails and such then I'd say its pretty okay.
Ultimately though, it depends on whatever site you're hoping to scrape from. Sometimes, a no-code tool just won't be enough.
If you're willing to put in extra time and effort, I'd recommend you check out John Watson Rooney on YouTube for more scraping info. He uses python, which is very beginner-friendly, and he explains really well too. So that may be your best bet if you don't wanna hire someone.
Within a few days you should be able to set up very basic scrapers for your needs, depending on the complexity of the sites you wish to scrape from.
j73242e
webscraping
ARandomBoiIsMe
t1_j73242e
https://www.reddit.com/r/webscraping/comments/10sqtgh/question_from_a_noncoder/j73242e/
2/3/2023 6:42:17 PM
2/3/2023 6:54:58 PM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
10sqtgh
t3_10sqtgh
10sqtgh
0
10sqtgh
False
False
False
0
1
29
29
9
6.56934306569343
1
0.72992700729927
0
0
58
42.3357664233577
137
128, 128, 128
3
Solid
50
No
485
Commented
3/30/2023 10:07:44 PM
I haven't used Octoparse but I have had multiple people complain about it to me.
In my opinion, no code, low code tools to scrape a general website don't work well. That is because most websites are actively trying to stop webscraping.
Your best bet is to hire someone to do it if you can't write the scraper yourself.
jebyawq
webscraping
GullibleEngineer4
t1_jebyawq
https://www.reddit.com/r/webscraping/comments/125sdvh/octoparse_is_one_of_the_most_frustrating_programs/jebyawq/
3/30/2023 10:07:44 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
125sdvh
t3_125sdvh
125sdvh
0
125sdvh
False
False
False
0
1
4
4
3
5.08474576271186
1
1.69491525423729
0
0
24
40.6779661016949
59
128, 128, 128
3
Solid
50
No
483
Commented
3/31/2023 1:41:13 PM
I would avoid using Octoparse and just build your own scrapers and find a good proxy provider.
jeemxc2
webscraping
gordongekko16
t1_jeemxc2
https://www.reddit.com/r/webscraping/comments/125sdvh/octoparse_is_one_of_the_most_frustrating_programs/jeemxc2/
3/31/2023 1:41:13 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
125sdvh
t3_125sdvh
125sdvh
0
125sdvh
False
False
False
0
1
4
4
1
5.88235294117647
0
0
0
0
8
47.0588235294118
17
128, 128, 128
3
Solid
50
No
486
Posted
3/29/2023 4:03:18 PM
I am almost done with this program. I am attempting to conduct a project scraping information regarding ChatGPT from the perspective of numerous different groups and every time I attempt to make a scrape it seemingly fails for no reason. The auto-detection feature doesn't detect relevant information for me to scrape and when I make even minor changes to the workflow the entire scrape fails. This program is so user unfriendly I am at a loss as to how someone not already familiar with scraping could learn how to use it effectively. Does anyone have any advice on where I could find adequate tutorials on how to use Octoparse? What I have found and watched has not been helpful at all.
125sdvh
webscraping
DmitriySokol
t3_125sdvh
https://www.reddit.com/r/webscraping/comments/125sdvh/octoparse_is_one_of_the_most_frustrating_programs/
3/29/2023 4:03:18 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse is one of the most frustrating programs I've used in my entire life
False
1
125sdvh
0
1
4
4
3
2.45901639344262
4
3.27868852459016
0
0
48
39.344262295082
122
128, 128, 128
3
Solid
50
No
484
Commented
3/30/2023 7:26:54 AM
Then don't use it
je8wjhn
webscraping
RobSm
t1_je8wjhn
https://www.reddit.com/r/webscraping/comments/125sdvh/octoparse_is_one_of_the_most_frustrating_programs/je8wjhn/
3/30/2023 7:26:54 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
125sdvh
t3_125sdvh
125sdvh
0
125sdvh
False
False
False
0
1
4
4
0
0
0
0
0
0
1
25
4
128, 128, 128
3
Solid
50
Yes
482
Commented
3/30/2023 4:26:06 AM
What are you trying to do? I'm not very familiar with octoparse
je8hq1d
webscraping
05_legend
t1_je8hq1d
https://www.reddit.com/r/webscraping/comments/125sdvh/octoparse_is_one_of_the_most_frustrating_programs/je8hq1d/
3/30/2023 4:26:06 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
125sdvh
t3_125sdvh
125sdvh
1
125sdvh
False
False
False
0
1
4
4
0
0
0
0
0
0
4
33.3333333333333
12
128, 128, 128
3
Solid
50
Yes
481
RepliedTo
3/30/2023 5:50:08 PM
I am attempting to scrape posts discussing ChatGPT from the perspective of numerous communities (Educators, Investors, Developers, etc.) Octoparse allows one to rapidly detect certain elements of a webpage and download them all into an Excel sheet. My frustrations lie with the logic behind the workflow system and the lack of tutorials for anything more than the most rudimentary of scrapes. It also features an automatic detection function to circumvent this issue, but it rarely detects the information I want, sometimes even ignoring the contents of posts on web forums. I was hoping someone knew of any tutorials or tips they found particularly helpful so I can understand the program better.
jeau4hq
webscraping
DmitriySokol
t1_jeau4hq
https://www.reddit.com/r/webscraping/comments/125sdvh/octoparse_is_one_of_the_most_frustrating_programs/jeau4hq/
3/30/2023 5:50:08 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
je8hq1d
t1_je8hq1d
je8hq1d
0
125sdvh
True
False
False
1
1
4
4
3
2.7027027027027
4
3.6036036036036
0
0
52
46.8468468468468
111
128, 128, 128
3
Solid
50
No
479
Commented
8/15/2018 4:32:19 PM
Why are you yelling? And that is not expert level stuff, that's "I took 2 seconds to read the Getting Started doc". Try.
e48o7ft
aws
cr125rider
t1_e48o7ft
https://www.reddit.com/r/aws/comments/97jo70/amazon_s3_expert_needed/e48o7ft/
8/15/2018 4:32:19 PM
1/1/0001 12:00:00 AM
False
False
17
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
97jo70
t3_97jo70
97jo70
0
97jo70
False
False
False
0
1
39
39
0
0
0
0
0
0
11
47.8260869565217
23
128, 128, 128
3
Solid
50
No
480
Posted
8/15/2018 4:25:12 PM
Hey everyone!
I'm fairly new to coding and have been working on a project for work. I'm using Octoparse and have been scraping websites into JSON files; however, my boss wants me to upload them into S3 buckets every day.
Ideally, I would like an API or something to automate this...
Anyone got any ideas, or someone I would be able to talk to?
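A hedged sketch of what the poster describes: boto3's `upload_file` plus a date-partitioned key, with the upload itself left in a function (it needs AWS credentials). The bucket name and file path are placeholders:

```python
# Daily S3 upload sketch. Only the key-building helper runs here; the
# upload function requires boto3 and AWS credentials and is not called.
import datetime
import pathlib

def daily_key(path: str, day: datetime.date) -> str:
    """Build a date-partitioned object key, e.g. scrapes/2018-08-15/data.json."""
    return f"scrapes/{day.isoformat()}/{pathlib.Path(path).name}"

print(daily_key("out/data.json", datetime.date(2018, 8, 15)))
# scrapes/2018-08-15/data.json

def upload_daily(path: str, bucket: str) -> None:
    # Placeholder bucket; credentials come from the usual AWS config/env.
    import boto3  # pip install boto3
    boto3.client("s3").upload_file(
        path, bucket, daily_key(path, datetime.date.today()))
```

Scheduling `upload_daily("out/data.json", "my-scrape-bucket")` with cron (or a cloud scheduler) covers the "every day" requirement.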
97jo70
aws
Clarkeyyyy
t3_97jo70
https://www.reddit.com/r/aws/comments/97jo70/amazon_s3_expert_needed/
8/15/2018 4:25:12 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
AMAZON S3 EXPERT NEEDED
False
0.15
97jo70
0
1
39
39
3
4.6875
0
0
0
0
23
35.9375
64
128, 128, 128
3
Solid
50
No
478
Commented
8/15/2018 4:29:58 PM
There's plenty of documentation for a simple use case like this - https://docs.aws.amazon.com/AmazonS3/latest/API/Welcome.html
e48o1jz
aws
exvancouverite
t1_e48o1jz
https://www.reddit.com/r/aws/comments/97jo70/amazon_s3_expert_needed/e48o1jz/
8/15/2018 4:29:58 PM
1/1/0001 12:00:00 AM
False
False
7
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
97jo70
t3_97jo70
97jo70
0
97jo70
False
False
False
0
1
39
39
0
0
0
0
0
0
5
45.4545454545455
11
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
477
Posted
12/10/2020 11:16:21 PM
no coding automation
kaq4p2
Automation_Central
Dan_Automation_Man
t3_kaq4p2
https://www.reddit.com/r/Automation_Central/comments/kaq4p2/got_to_love_event_ghost_octoparse_and_katolon/
12/10/2020 11:16:21 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
got to love event ghost octoparse and katolon studio for windows gui automation
False
1
kaq4p2
0
4
1
1
0
0
0
0
0
0
2
66.6666666666667
3
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
476
Posted
12/10/2020 11:16:00 PM
kaq4f7
Automation_Central
Dan_Automation_Man
t3_kaq4f7
https://www.reddit.com/r/Automation_Central/comments/kaq4f7/got_to_love_event_ghost_octoparse_and_katolon/
12/10/2020 11:16:00 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
got to love event ghost octoparse and katolon studio for windows gui automation
False
1
kaq4f7
0
4
1
1
128, 128, 128
3
Solid
50
No
475
Posted
11/23/2021 1:52:16 AM
Web scraping means extracting data from any website into a structured format such as CSV or Excel. This article presents the 25 most popular applications of web scraping, which you can use to grow your business.
[https://octoparse.de/blog/die-25-beliebtesten-anwendungen-von-web-scraping](https://octoparse.de/blog/die-25-beliebtesten-anwendungen-von-web-scraping)
r01wgm
WebScrapingDe
OctoparseDe
t3_r01wgm
https://www.reddit.com/r/WebScrapingDe/comments/r01wgm/die_25_beliebtesten_anwendungen_von_web_scraping/
11/23/2021 1:52:16 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
The 25 Most Popular Applications of Web Scraping
False
1
r01wgm
0
1
1
1
1
1.63934426229508
3
4.91803278688525
0
0
34
55.7377049180328
61
128, 128, 128
3
Solid
50
No
474
Posted
6/13/2021 8:47:42 PM
Hello. I am looking to scrape a single page of plain text data from a website (on a recurring schedule, once daily). The webpage consists of a table of approx. 165 words, 15 rows and 4 columns. I would need the data transferred to Google Sheets in the same table format. The webpage that needs to be scraped daily: [http://louey.org/availability.php](http://louey.org/availability.php)
I looked into services like Octoparse and Parsehub, but they require a plan subscription (for scheduled scrapes) which far exceeds the amount I can pay for a single task like this.
Please message me if you would be interested
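For a small, fixed table like this, the standard library alone can do the parsing; below is a minimal sketch using `html.parser`. The actual layout of the page above may differ, and pushing the rows into Google Sheets (e.g. via the gspread library) is an assumed follow-on step not shown here:

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by table row."""
    def __init__(self):
        super().__init__()
        self.rows = []    # finished rows, each a list of cell strings
        self._row = None  # cells of the row currently being parsed
        self._cell = None # text chunks of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

def table_rows(html):
    parser = TableParser()
    parser.feed(html)
    return parser.rows
```

Fetching the page once a day with `urllib.request.urlopen` from a cron job would cover the scheduling side without any paid scraping plan.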
nz5la6
scrapy
rising_gmni
t3_nz5la6
https://www.reddit.com/r/scrapy/comments/nz5la6/looking_to_hire_someone_for_a_simple_scrape_that/
6/13/2021 8:47:42 PM
1/1/0001 12:00:00 AM
False
False
7
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Looking to hire someone for a simple scrape that needs to be scheduled daily
False
1
nz5la6
0
1
55
55
1
0.934579439252336
0
0
0
0
59
55.1401869158879
107
128, 128, 128
3
Solid
50
No
473
Commented
6/14/2021 11:22:46 AM
I'd be happy to help you. Please, DM me with the format you want in the spreadsheet.
h1pwluv
scrapy
devcorp101
t1_h1pwluv
https://www.reddit.com/r/scrapy/comments/nz5la6/looking_to_hire_someone_for_a_simple_scrape_that/h1pwluv/
6/14/2021 11:22:46 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
nz5la6
t3_nz5la6
nz5la6
0
nz5la6
False
False
False
0
1
55
55
1
5.88235294117647
0
0
0
0
5
29.4117647058824
17
128, 128, 128
3
Solid
50
Yes
471
Commented
12/16/2022 5:19:11 AM
Instead of the product name, can you grab the product code / SKU?
j0fa2jm
consulting
PrivateEquityAdvisor
t1_j0fa2jm
https://www.reddit.com/r/consulting/comments/zmd2s4/ecommerce_price_scraping/j0fa2jm/
12/16/2022 5:19:11 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
zmd2s4
t3_zmd2s4
zmd2s4
1
zmd2s4
False
False
False
0
1
18
18
0
0
0
0
0
0
8
72.7272727272727
11
128, 128, 128
3
Solid
50
Yes
470
RepliedTo
12/16/2022 8:47:18 AM
It's possible but wouldn't the product code/SKU differ across the competitors?
j0fr8pk
consulting
Mysterious-Airline1
t1_j0fr8pk
https://www.reddit.com/r/consulting/comments/zmd2s4/ecommerce_price_scraping/j0fr8pk/
12/16/2022 8:47:18 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j0fa2jm
t1_j0fa2jm
j0fa2jm
1
zmd2s4
True
False
False
1
1
18
18
0
0
0
0
0
0
6
50
12
128, 128, 128
3
Solid
50
No
469
RepliedTo
12/16/2022 4:29:25 PM
Almost certainly yes, but it would not take you long to figure out. It's not sexy like NLP, but it's the "quick and dirty" way to accomplish your task if you're only looking into a subset of products.
If you really wanted to, you could farm this out to a bunch of contractors via MTurk or a similar service.
j0h42dm
consulting
Wonder-Barr
t1_j0h42dm
https://www.reddit.com/r/consulting/comments/zmd2s4/ecommerce_price_scraping/j0h42dm/
12/16/2022 4:29:25 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j0fr8pk
t1_j0fr8pk
j0fr8pk
0
zmd2s4
False
False
False
2
1
18
18
2
3.44827586206897
1
1.72413793103448
0
0
20
34.4827586206897
58
128, 128, 128
3
Solid
50
No
472
Posted
12/15/2022 5:23:59 AM
I'm launching an online grocery store soon, and at the moment, I am conducting an ecommerce competitor price analysis using Octoparse. There are about 5 competitors that I am comparing against.
The **challenge** is in the variation in product naming. Although most of the products are similar, the competitors have named the products differently or with slight variations.
**E.G. Dark Soya Sauce**
*Competitor 1 - 777 Dark Soya Sauce*
*Competitor 2 - Dark Soy Sauce*
*Competitor 3 - Soya Sauce Dark*
*Competitor 4 - Soya Sauce (Dark)*
Is there a way for me to **find a specific keyword** across the different competitors and **scrape that particular price?**
E.G. Keyword - "Soya Sauce" or "Soya"
Would appreciate some guidance! Thank you.
zmd2s4
consulting
Mysterious-Airline1
t3_zmd2s4
https://www.reddit.com/r/consulting/comments/zmd2s4/ecommerce_price_scraping/
12/15/2022 5:23:59 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Ecommerce Price Scraping
False
0.33
zmd2s4
0
1
18
18
3
2.56410256410256
5
4.27350427350427
0
0
56
47.8632478632479
117
128, 128, 128
3
Solid
50
Yes
468
Commented
12/15/2022 6:29:37 PM
This sounds like a combination of keyword extraction (soy, dark, and sauce are the relevant key words in your example) and fuzzy matching (e.g., soy and soya differ by a single letter). You can look up both those terms to see what tools are offered for each and determine which best fits your use case.
Both of these concepts fall under the umbrella of natural language processing (NLP).
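As a rough sketch of the fuzzy-matching side, Python's standard-library `difflib` is often enough for near-identical product names like the soya sauce example; the 0.8 similarity threshold below is an arbitrary assumption to tune against real data, and keyword extraction is assumed to have happened already:

```python
import re
from difflib import SequenceMatcher

def _tokens(text):
    """Lowercase alphanumeric tokens, ignoring punctuation like (Dark)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_keyword(product_name, keyword, threshold=0.8):
    """True if every keyword token fuzzily matches some token in the name."""
    name_tokens = _tokens(product_name)
    return all(
        any(SequenceMatcher(None, k, t).ratio() >= threshold for t in name_tokens)
        for k in _tokens(keyword)
    )
```

With this, "soy" and "soya" score 2*3/7 ≈ 0.86 and therefore match, so all four competitor variants of "Dark Soya Sauce" are caught by the keyword "Soya Sauce".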
j0crrrc
consulting
econofit
t1_j0crrrc
https://www.reddit.com/r/consulting/comments/zmd2s4/ecommerce_price_scraping/j0crrrc/
12/15/2022 6:29:37 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
zmd2s4
t3_zmd2s4
zmd2s4
2
zmd2s4
False
False
False
0
1
18
18
1
1.44927536231884
3
4.34782608695652
0
0
34
49.2753623188406
69
128, 128, 128
3
Solid
50
Yes
467
RepliedTo
12/16/2022 8:46:51 AM
Thank you for this excellent response! I'll have a look at the terms you suggested. Appreciate it!
j0fr7im
consulting
Mysterious-Airline1
t1_j0fr7im
https://www.reddit.com/r/consulting/comments/zmd2s4/ecommerce_price_scraping/j0fr7im/
12/16/2022 8:46:51 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j0crrrc
t1_j0crrrc
j0crrrc
0
zmd2s4
True
False
False
1
1
18
18
3
17.6470588235294
0
0
0
0
4
23.5294117647059
17
128, 128, 128
3
Solid
50
No
466
RepliedTo
12/15/2022 8:31:51 PM
This is a good answer ^^ (as someone who has done a lot of work with NLP)
j0datat
consulting
MarkusKruber
t1_j0datat
https://www.reddit.com/r/consulting/comments/zmd2s4/ecommerce_price_scraping/j0datat/
12/15/2022 8:31:51 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j0crrrc
t1_j0crrrc
j0crrrc
0
zmd2s4
False
False
False
1
1
18
18
2
12.5
0
0
0
0
5
31.25
16
128, 128, 128
3
Solid
50
Yes
464
Commented
4/24/2023 11:09:09 AM
Give the website name and the data you are trying to scrape
jhi70d0
webscraping
Kong_Don
t1_jhi70d0
https://www.reddit.com/r/webscraping/comments/12ww6yh/gui_scrapers_that_support_xpath_20/jhi70d0/
4/24/2023 11:09:09 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
12ww6yh
t3_12ww6yh
12ww6yh
1
12ww6yh
False
False
False
0
1
5
5
0
0
0
0
0
0
6
60
10
128, 128, 128
3
Solid
50
Yes
463
RepliedTo
4/24/2023 6:11:02 PM
As stated in my topic, I scrape similar data from thousands of websites, not the other way around. A rough example: scraping addresses or phone numbers from all of the websites from Google Search. If anything, that's not exactly what I'm scraping, but I'd like to keep it private. The idea is the same.
jhjs861
webscraping
thisisprice
t1_jhjs861
https://www.reddit.com/r/webscraping/comments/12ww6yh/gui_scrapers_that_support_xpath_20/jhjs861/
4/24/2023 6:11:02 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jhi70d0
t1_jhi70d0
jhi70d0
0
12ww6yh
True
False
False
1
1
5
5
0
0
1
1.85185185185185
0
0
24
44.4444444444444
54
128, 128, 128
3
Solid
50
Yes
462
Commented
4/24/2023 3:06:30 AM
You ask chatgpt?
jhh4k0t
webscraping
brownbottlecap
t1_jhh4k0t
https://www.reddit.com/r/webscraping/comments/12ww6yh/gui_scrapers_that_support_xpath_20/jhh4k0t/
4/24/2023 3:06:30 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
12ww6yh
t3_12ww6yh
12ww6yh
1
12ww6yh
False
False
False
0
1
5
5
0
0
0
0
0
0
2
66.6666666666667
3
128, 128, 128
3
Solid
50
Yes
461
RepliedTo
4/24/2023 6:04:55 PM
Yes, it always provides non-working queries, and it doesn't even know the limitations of XPath 1.0. Or well, maybe it does know them, but it doesn't apply them when coming up with a query, no matter how hard I tell it to do it.
jhjra9r
webscraping
thisisprice
t1_jhjra9r
https://www.reddit.com/r/webscraping/comments/12ww6yh/gui_scrapers_that_support_xpath_20/jhjra9r/
4/24/2023 6:04:55 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jhh4k0t
t1_jhh4k0t
jhh4k0t
0
12ww6yh
True
False
False
1
1
5
5
1
2.17391304347826
2
4.34782608695652
0
0
15
32.6086956521739
46
128, 128, 128
3
Solid
50
No
465
Posted
4/24/2023 12:24:24 AM
Hi guys. I'll try to explain the situation the best I can. I am not a coder, not even a web designer or anything like that, just an advanced PC user at best, but I still **really** need to do some complicated scraping that would involve regular expressions. It's not even for my job, but only for a hobby project, so it's not an option to hire a programmer to write a custom scraper for me. So I need to achieve it myself. The problem is that any scraping software that I come across (like Octoparse, ScrapeStorm, etc.) only supports XPath 1.0 and it's physically impossible to achieve what I need in those. My queries are too complicated as they involve scraping similar data from different websites instead of different data from the same website like it's usually done. I know it's possible, but I don't have the required skill (and time/patience to learn) to work with command line scripts and/or do some actual coding, so I'm desperately trying to find a ready-to-go GUI solution like Octoparse or ScrapeStorm, but it MUST support XPath 2.0 or newer. I took the time and nerves to build the actual queries that I need, and I know that they work, but I can't do anything with them without a scraper that supports them. Any advice? Huge thanks in advance.
P. S. I probably should explain why I need a GUI scraper if I already built my own queries. For two reasons. 1) Because I have no idea (and don't have the nerve to learn) how to replicate steps like "Click", "Loop", "Branch", etc. without a GUI. 2) Because I need to be able to visually see if my queries actually work before launching a task, and such software highlights the matched elements. I have no idea how I would work without all that.
12ww6yh
webscraping
thisisprice
t3_12ww6yh
https://www.reddit.com/r/webscraping/comments/12ww6yh/gui_scrapers_that_support_xpath_20/
4/24/2023 12:24:24 AM
4/24/2023 12:33:56 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
GUI scrapers that support XPath 2.0+
False
1
12ww6yh
0
1
5
5
13
4.11392405063291
5
1.58227848101266
0
0
125
39.5569620253165
316
128, 128, 128
3
Solid
50
Yes
460
Commented
4/24/2023 2:04:08 AM
> it's physically impossible to achieve what I need in those
I have suspicions about such a statement; how about you cook up an example of some HTML that is "physically impossible" to match with XPath 1.0 but that you can match with XPath 2.0
Now, your requirement that it be a "GUI scraper" very easily complicates things, so maybe it's their bug or something, but my request stands: I'm really curious to know what structure XPath 1.0 cannot match
jhgxaqm
webscraping
mdaniel
t1_jhgxaqm
https://www.reddit.com/r/webscraping/comments/12ww6yh/gui_scrapers_that_support_xpath_20/jhgxaqm/
4/24/2023 2:04:08 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
12ww6yh
t3_12ww6yh
12ww6yh
2
12ww6yh
False
False
False
0
1
5
5
0
0
4
4.8780487804878
0
0
31
37.8048780487805
82
128, 128, 128
3
Solid
50
Yes
459
RepliedTo
4/24/2023 6:07:42 PM
See e.g. this reply on Stackoverflow: https://stackoverflow.com/a/75963607/21485356
jhjrpnj
webscraping
thisisprice
t1_jhjrpnj
https://www.reddit.com/r/webscraping/comments/12ww6yh/gui_scrapers_that_support_xpath_20/jhjrpnj/
4/24/2023 6:07:42 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jhgxaqm
t1_jhgxaqm
jhgxaqm
0
12ww6yh
True
False
False
1
1
5
5
0
0
0
0
0
0
3
42.8571428571429
7
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
457
Commented
3/22/2021 9:15:13 PM
On the same note, I am currently writing about how terms related to distance learning were used during the 2020 part of the pandemic, and I was wondering if there are any corpora similar to the one posted above. It would be interesting to find some data from social media.
gruungn
linguistics
Sedulas
t1_gruungn
https://www.reddit.com/r/linguistics/comments/max5ez/creating_a_twitter_corpus/gruungn/
3/22/2021 9:15:13 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
max5ez
t3_max5ez
max5ez
1
max5ez
False
False
False
0
2
17
17
1
2.17391304347826
0
0
0
0
22
47.8260869565217
46
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
456
RepliedTo
3/22/2021 9:38:21 PM
Did you search the website linked above? There is this - [https://www.kaggle.com/barishasdemir/tweets-about-distance-learning](https://www.kaggle.com/barishasdemir/tweets-about-distance-learning)
gruxk4m
linguistics
elles_bells_
t1_gruxk4m
https://www.reddit.com/r/linguistics/comments/max5ez/creating_a_twitter_corpus/gruxk4m/
3/22/2021 9:38:21 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gruungn
t1_gruungn
gruungn
1
max5ez
True
False
False
1
4
17
17
0
0
0
0
0
0
14
50
28
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
455
RepliedTo
3/22/2021 9:42:14 PM
Thanks, I posted my comment after seeing this link; I was wondering whether there are similar ones, for example collected from Facebook/LinkedIn or similar platforms
gruy1ju
linguistics
Sedulas
t1_gruy1ju
https://www.reddit.com/r/linguistics/comments/max5ez/creating_a_twitter_corpus/gruy1ju/
3/22/2021 9:42:14 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gruxk4m
t1_gruxk4m
gruxk4m
1
max5ez
False
False
False
2
2
17
17
0
0
0
0
0
0
15
55.5555555555556
27
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
454
RepliedTo
3/22/2021 9:47:24 PM
Ahhh my bad haha hopefully there are!
gruyp43
linguistics
elles_bells_
t1_gruyp43
https://www.reddit.com/r/linguistics/comments/max5ez/creating_a_twitter_corpus/gruyp43/
3/22/2021 9:47:24 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gruy1ju
t1_gruy1ju
gruy1ju
0
max5ez
True
False
False
3
4
17
17
0
0
1
14.2857142857143
0
0
3
42.8571428571429
7
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
453
Commented
3/22/2021 8:31:33 PM
Yea that’s not that hard. But why bother when there are dozens of Twitter corpora available on the web?
gruozy4
linguistics
edwardsrk
t1_gruozy4
https://www.reddit.com/r/linguistics/comments/max5ez/creating_a_twitter_corpus/gruozy4/
3/22/2021 8:31:33 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
max5ez
t3_max5ez
max5ez
1
max5ez
False
False
False
0
2
17
17
1
5
2
10
0
0
5
25
20
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
452
RepliedTo
3/22/2021 8:47:25 PM
I haven't found any free access ones that suit what I need - do you have any recommendations?
grur2f1
linguistics
elles_bells_
t1_grur2f1
https://www.reddit.com/r/linguistics/comments/max5ez/creating_a_twitter_corpus/grur2f1/
3/22/2021 8:47:25 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gruozy4
t1_gruozy4
gruozy4
1
max5ez
True
False
False
1
4
17
17
2
11.7647058823529
0
0
0
0
6
35.2941176470588
17
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
451
RepliedTo
3/22/2021 8:50:51 PM
Also yea
https://www.kaggle.com/datasets?search=Tweet
You might also cross-post this to r/languagetechnology; they'll be able to better help you out using the Twitter API if none of these datasets appeal to you
grurir7
linguistics
edwardsrk
t1_grurir7
https://www.reddit.com/r/linguistics/comments/max5ez/creating_a_twitter_corpus/grurir7/
3/22/2021 8:50:51 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
grur2f1
t1_grur2f1
grur2f1
1
max5ez
False
False
False
2
2
17
17
2
6.45161290322581
0
0
0
0
12
38.7096774193548
31
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
450
RepliedTo
3/22/2021 8:59:53 PM
Brilliant thanks for your help!
grusp9l
linguistics
elles_bells_
t1_grusp9l
https://www.reddit.com/r/linguistics/comments/max5ez/creating_a_twitter_corpus/grusp9l/
3/22/2021 8:59:53 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
grurir7
t1_grurir7
grurir7
0
max5ez
True
False
False
3
4
17
17
1
20
0
0
0
0
2
40
5
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
562
Posted
3/20/2021 10:06:50 AM
This question might "straddle the line" so forgive me if it does.
Essentially, I would like to create a small corpus of tweets containing a keyword. I know it's something that can be done in Python, but I would have to dedicate time to learning it, so I just want to know first if there is an easier way to do it.
I've tried Octoparse but it was a little limited. If anyone has any recommendations of programs, do let me know! (or an easy guide on doing the whole Python thing)
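For reference, a minimal sketch of what the Python route might look like against the Twitter API v2 recent-search endpoint, assuming you already have a bearer token; the endpoint URL and parameter names follow the v2 API, but check the current documentation before relying on them:

```python
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"

def build_request(keyword, bearer_token, max_results=100):
    """Build an authenticated GET request for recent tweets containing `keyword`."""
    params = {
        "query": f'"{keyword}" -is:retweet lang:en',  # v2 query operators
        "max_results": str(max_results),              # up to 100 per page
        "tweet.fields": "created_at,text",
    }
    url = SEARCH_URL + "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {bearer_token}"}
    )
```

Calling `urllib.request.urlopen` on the returned request would then yield a JSON page of matching tweets to append to your corpus.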
m9429a
learnprogramming
elles_bells_
t3_m9429a
https://www.reddit.com/r/learnprogramming/comments/m9429a/what_is_the_best_way_to_gather_a_database_of/
3/20/2021 10:06:50 AM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What is the best way to gather a database of tweets which contain a key word? Is there a program to use or should I just learn how to do it in python?
False
0.67
m9429a
0
4
17
17
3
3.2967032967033
1
1.0989010989011
0
0
32
35.1648351648352
91
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
561
Posted
3/22/2021 8:28:37 PM
Has anyone had any experience creating a corpus of tweets with a keyword? My programming skills are very limited but python seems to be the only way to do this properly? I did use Octoparse for a bit but it was a little limited...
I thought I would ask here in case there's someone who has been in the same boat as me and has any advice??
Thanks in advance
max5ez
linguistics
elles_bells_
t3_max5ez
https://www.reddit.com/r/linguistics/comments/max5ez/creating_a_twitter_corpus/
3/22/2021 8:28:37 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Creating a twitter corpus?
False
1
max5ez
0
4
17
17
1
1.42857142857143
2
2.85714285714286
0
0
25
35.7142857142857
70
128, 128, 128
3
Solid
50
No
446
Commented
12/9/2022 12:10:39 PM
Don't know Octoparse, but another way to do it is to access the node through XPath with the contains() function (so here contains -> Impressum). With that you lead your scraper to that node and extract the link.
iziq8km
webscraping
sturmsignal
t1_iziq8km
https://www.reddit.com/r/webscraping/comments/wy9nkn/how_do_i_use_regex_to_find_impressum_page_links/iziq8km/
12/9/2022 12:10:39 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
wy9nkn
t3_wy9nkn
wy9nkn
0
wy9nkn
False
False
False
0
1
40
40
1
2.63157894736842
0
0
0
0
15
39.4736842105263
38
128, 128, 128
3
Solid
50
Yes
445
Commented
8/26/2022 3:26:43 PM
I don't use octoparse, but in any language there will be a function that returns the index of a string within a string, or a boolean of whether or not a string is contained in a string --> regex probably isn't needed.
If you can get all the links (all the <a> tags), then loop through them and see if the href attribute contains impressum, you should be able to get it.
Some C# for example
var theImpressumLink = "";
foreach (var a in aLinks)
{
var href = a.GetAttributeValue("href", "");
if (href.Contains("impressum")) theImpressumLink = href;
}
ilvpy03
webscraping
duhhuh
t1_ilvpy03
https://www.reddit.com/r/webscraping/comments/wy9nkn/how_do_i_use_regex_to_find_impressum_page_links/ilvpy03/
8/26/2022 3:26:43 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
wy9nkn
t3_wy9nkn
wy9nkn
1
wy9nkn
False
False
False
0
1
40
40
0
0
0
0
0
0
40
43.010752688172
93
128, 128, 128
3
Solid
50
Yes
444
RepliedTo
8/26/2022 10:27:19 PM
Sensible approach; I just like Octoparse better for convenience really, it makes things faster. But I guess some jobs need to be done yourself.
ilxh4r6
webscraping
wannakeepmyanonymity
t1_ilxh4r6
https://www.reddit.com/r/webscraping/comments/wy9nkn/how_do_i_use_regex_to_find_impressum_page_links/ilxh4r6/
8/26/2022 10:27:19 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ilvpy03
t1_ilvpy03
ilvpy03
0
wy9nkn
True
False
False
1
1
40
40
4
16.6666666666667
0
0
0
0
10
41.6666666666667
24
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
No
449
Posted
8/26/2022 1:46:18 PM
I am trying to filter through the html of homepages to find the link for the impressum on german websites. An impressum is a legal requirement for websites that needs to include contact details, which obviously are my target.
Unfortunately, not all impressum pages are under [www.example.com/impressum](http://www.example.com/impressum), so I need to find the link to copy and open it to extract my data. I'm not really familiar with regex or what expression I need to use. For convenience I just use Octoparse for now (not sure if that makes a difference for regex; as I said, I'm not familiar with it).
Does anyone know what the expression has to look like to find the links that include "impressum"?
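As a hedged sketch of the regex in question: the pattern below pulls every href whose URL contains "impressum", case-insensitively. Note that it only inspects the URL itself; if only the link *text* says "Impressum", you would need an HTML parser that also checks the anchor text:

```python
import re

# href="..." or href='...' where the URL contains "impressum" (any case)
IMPRESSUM_LINK = re.compile(r'href=["\']([^"\']*impressum[^"\']*)["\']', re.IGNORECASE)

def impressum_links(html):
    """Return every href value on the page whose URL contains 'impressum'."""
    return IMPRESSUM_LINK.findall(html)
```

The same character class trick (`[^"']*`) keeps the match inside a single attribute value, which is what makes this usable on raw homepage HTML.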
wy9nkn
webscraping
wannakeepmyanonymity
t3_wy9nkn
https://www.reddit.com/r/webscraping/comments/wy9nkn/how_do_i_use_regex_to_find_impressum_page_links/
8/26/2022 1:46:18 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How do I use regex to find impressum page links?
False
1
wy9nkn
0
9
40
40
1
0.806451612903226
1
0.806451612903226
0
0
53
42.741935483871
124
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
No
448
Posted
8/18/2022 1:40:43 PM
I have the issue that I want to collect business information from Google Maps, but I fail to find the XPath correctly. I click 2-3 links until they are all green and then select the "loop click through them all" option.
Works fine in theory, but when I try to run it, I get nil results. Going back at it, I find the error "can't find xpath" for each looped item. I also try to paginate the result pages, but the issue is there from the very beginning.
I need to open the entries so I can extract the phone number, address and website url.
Another issue is that, when I got it running somehow, I only could collect half of the data sometimes because for some reason Octoparse couldn't find the URL although it's all under the same button? I know, weird.
Anyone experienced with octoparse? I thought it would make it a lot easier by just clicking at stuff, but it's still a learning curve.
wrjvya
webscraping
wannakeepmyanonymity
t3_wrjvya
https://www.reddit.com/r/webscraping/comments/wrjvya/octoparse_and_google_maps/
8/18/2022 1:40:43 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse and Google Maps
False
1
wrjvya
0
9
40
40
4
2.39520958083832
6
3.59281437125748
0
0
69
41.3173652694611
167
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
No
447
Posted
8/20/2022 5:23:46 PM
I set everything up but the issue is that when I try to run it that there is a problem saying "can't find the xpath".
Basically the procedure is this: I open the google maps, then loop click each entry to open up the information and try to scrape it.
But Octoparse can't find the XPath of the entries, especially after paginating. What do I have to do here? Customer support isn't open until Monday, and their response so far wasn't really helpful, as they said I should use their pre-built scraper, which doesn't get the information I need since it doesn't click the entries.
How do I tell octoparse what to click?
wtcicu
webscraping
wannakeepmyanonymity
t3_wtcicu
https://www.reddit.com/r/webscraping/comments/wtcicu/octoparse_help_how_to_i_get_address_phone_number/
8/20/2022 5:23:46 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse help? How do I get address, phone number and website from google maps?
False
1
wtcicu
0
9
40
40
2
1.73913043478261
2
1.73913043478261
0
0
45
39.1304347826087
115
128, 128, 128
3
Solid
50
Yes
443
Commented
5/30/2022 10:17:25 AM
This looks very useful but unfortunately I can't access this webpage
iaj2kla
Octoparse_ideas
EugeneQuah
t1_iaj2kla
https://www.reddit.com/r/Octoparse_ideas/comments/v0w3f5/how_to_extract_data_from_pdf_to_excel_without/iaj2kla/
5/30/2022 10:17:25 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
v0w3f5
t3_v0w3f5
v0w3f5
1
v0w3f5
False
False
False
0
1
3
3
1
9.09090909090909
1
9.09090909090909
0
0
4
36.3636363636364
11
128, 128, 128
3
Solid
50
Yes
442
RepliedTo
5/31/2022 1:15:05 AM
Here's the link: [https://www.octoparse.com/blog/how-to-extract-pdf-into-excel?utm\_source=sale2022&utm\_medium=extractdatafrompdf&utm\_campaign=reddit](https://www.octoparse.com/blog/how-to-extract-pdf-into-excel?utm_source=sale2022&utm_medium=extractdatafrompdf&utm_campaign=reddit)
iam00d0
Octoparse_ideas
Octoparseideas
t1_iam00d0
https://www.reddit.com/r/Octoparse_ideas/comments/v0w3f5/how_to_extract_data_from_pdf_to_excel_without/iam00d0/
5/31/2022 1:15:05 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iaj2kla
t1_iaj2kla
iaj2kla
0
v0w3f5
True
False
False
1
1
3
3
2
5
0
0
0
0
25
62.5
40
128, 128, 128
3
Solid
50
No
458
RepliedTo
4/24/2023 2:12:02 AM
I had some issues with PHP where it only supports xpath 1.
Forgot the exact test case, but in XPath 1.0 you had to traverse all nodes to find the one you're looking for, and in XPath 2.0 you could select that node specifically (like when you have a CSS selector for an href with a certain URL)
jhgy8w1
webscraping
Annh1234
t1_jhgy8w1
https://www.reddit.com/r/webscraping/comments/12ww6yh/gui_scrapers_that_support_xpath_20/jhgy8w1/
4/24/2023 2:12:02 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jhgxaqm
t1_jhgxaqm
jhgxaqm
0
12ww6yh
False
False
False
1
1
5
5
1
1.81818181818182
1
1.81818181818182
0
0
21
38.1818181818182
55
128, 128, 128
3
Solid
50
No
438
RepliedTo
5/1/2020 1:43:58 AM
Do this, else it's a ton of work.
Basically, if you google "site:site1.com keyword", you get results if site1.com has that keyword indexed for it.
If it doesn't, it's because it's not allowed by robots.txt
fp4eonp
webscraping
Annh1234
t1_fp4eonp
https://www.reddit.com/r/webscraping/comments/g9n330/help_with_hopefully_simple_unstructured_web/fp4eonp/
5/1/2020 1:43:58 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fow4ddu
t1_fow4ddu
fow4ddu
0
g9n330
False
False
False
1
1
5
5
1
2.63157894736842
0
0
0
0
14
36.8421052631579
38
128, 128, 128
3.00094607379376
Solid
49.9959453980268
No
441
Posted
3/1/2021 6:30:45 AM
Couldn't get a Twitter API account, so trying Octoparse for now.
Any idea how to get a Twitter URL extracted?
lv22w3
webscraping
LiveEhLearn
t3_lv22w3
https://www.reddit.com/r/webscraping/comments/lv22w3/octoparse_twittter_url/
3/1/2021 6:30:45 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse - Twittter URL
False
1
lv22w3
0
2
5
5
0
0
0
0
0
0
9
45
20
128, 128, 128
3.00094607379376
Solid
49.9959453980268
No
440
Commented
4/2/2021 1:50:51 AM
Thanks everyone. Realized it was because of the type of user I selected during the Twitter API application. Found a local to help me out, but may DM you in the future to pick some minds.
Now if I could figure out how to re-apply for the Twitter API from the same Twitter @ handle...
gt3krfk
webscraping
LiveEhLearn
t1_gt3krfk
https://www.reddit.com/r/webscraping/comments/lv22w3/octoparse_twittter_url/gt3krfk/
4/2/2021 1:50:51 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lv22w3
t3_lv22w3
lv22w3
0
lv22w3
True
False
False
0
2
5
5
0
0
0
0
0
0
25
45.4545454545455
55
128, 128, 128
3
Solid
50
No
439
Commented
3/2/2021 5:30:31 AM
It's not easy to scrape Twitter without using their official API --- but it is easy to get your IP banned! Send me a DM with what you're trying to scrape and I'll see if I can help you out
gpdwtcq
webscraping
matty_fu
t1_gpdwtcq
https://www.reddit.com/r/webscraping/comments/lv22w3/octoparse_twittter_url/gpdwtcq/
3/2/2021 5:30:31 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:1 Gold:0 Platinum:0 Count:1
False
False
lv22w3
t3_lv22w3
lv22w3
0
lv22w3
False
False
False
0
1
5
5
2
5.12820512820513
0
0
0
0
14
35.8974358974359
39
128, 128, 128
3
Solid
50
No
436
Commented
3/1/2021 7:08:57 AM
Depending on what you need to extract, you might not need Octoparse. Happy to help... DM me.
gp9z0ex
webscraping
oukaili80
t1_gp9z0ex
https://www.reddit.com/r/webscraping/comments/lv22w3/octoparse_twittter_url/gp9z0ex/
3/1/2021 7:08:57 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lv22w3
t3_lv22w3
lv22w3
0
lv22w3
False
False
False
0
1
5
5
1
5.88235294117647
0
0
0
0
7
41.1764705882353
17
128, 128, 128
3
Solid
50
No
431
Commented
5/17/2021 7:09:34 PM
Agree that these are epoch timestamps in milliseconds rather than seconds.
For example, `1620945297000` is the equivalent of `Thu May 13 2021 22:34:57` in UTC.
gyha9fq
learnprogramming
insertAlias
t1_gyha9fq
https://www.reddit.com/r/learnprogramming/comments/neo7tb/making_sense_of_octoparses_timestamps/gyha9fq/
5/17/2021 7:09:34 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
neo7tb
t3_neo7tb
neo7tb
0
neo7tb
False
False
False
0
1
2
2
0
0
0
0
0
0
15
55.5555555555556
27
128, 128, 128
3
Solid
50
No
430
Commented
5/17/2021 7:05:27 PM
Looks like a Unix-style timestamp, although in milliseconds; convert it to a date and time, for example at https://currentmillis.com/
If you want to figure it out by hand, it's the number of milliseconds since 00:00:00 UTC on 1 January 1970.
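The same conversion is a one-liner in Python (divide by 1000 because `fromtimestamp` expects seconds, not milliseconds):

```python
from datetime import datetime, timezone

def from_epoch_millis(ms):
    """Convert a millisecond Unix timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
```

For example, `from_epoch_millis(1620945297000)` gives 2021-05-13 22:34:57 UTC, matching the value worked out above.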
gyh9o9n
learnprogramming
gramdel
t1_gyh9o9n
https://www.reddit.com/r/learnprogramming/comments/neo7tb/making_sense_of_octoparses_timestamps/gyh9o9n/
5/17/2021 7:05:27 PM
5/17/2021 7:11:50 PM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
neo7tb
t3_neo7tb
neo7tb
0
neo7tb
False
False
False
0
1
2
2
0
0
0
0
0
0
19
51.3513513513514
37
128, 128, 128
3
Solid
50
No
429
Posted
12/20/2018 9:58:34 PM
*I'm not sure if this is the place for this, but I don't know where else to put it.*
A large trend in web development has been making sites more accessible and easier to use, which is a good thing. But I feel there is a lot more to be done for people who are willing to learn new tools to effectively exploit the internet and the web.
**Here is the pitch**:
A browser with built-in web-crawling capabilities, allowing the user to easily explore not just the current site, but also the surrounding internet.
&#x200B;
**Use case:**
You're on a site with loads of links to buy things, but you only want to spend in GBP, not USD. Search all the links on the page for a set of keywords like {GBP, Buy, etc...}
&#x200B;
**Use case:**
You're on an old site with loads of links, but you're not sure which ones are dead. So you search all the links on the page, verifying which ones are up and ranking them by how recently they were created.
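The dead-link check in that use case boils down to grouping links by their HTTP status (a minimal sketch; the URLs and the `classify_links` helper are illustrative, and a real tool would obtain the status codes by requesting each link):

```python
def classify_links(statuses):
    """Split links into 'up' and 'dead' based on HTTP status codes.

    `statuses` maps URL -> status code; codes below 400 count as up.
    """
    result = {"up": [], "dead": []}
    for url, code in statuses.items():
        result["up" if code < 400 else "dead"].append(url)
    return result

# Hypothetical status codes gathered by a crawler
print(classify_links({"https://example.com": 200,
                      "https://old.example/page": 404}))
```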
&#x200B;
**Use case:**
You found a nice PDF file. You want to see what else is on this site, so you search the domain, following all the internal links you can and trying a table of common subdomains.
&#x200B;
**Use case:**
You find a cool site, but you're worried it's going to go down, so you quickly save the whole thing to disk.
&#x200B;
**Use case:**
You find an edgy site, but you're not sure how edgy. So you check out what this site links to and what links to it. It automatically looks up the WHOIS record and sees what else that person does, what else is hosted on that domain and server, how new the site is, and how much traffic it gets, then summarises all the information.
&#x200B;
I imagine something based off [REBL](https://github.com/cognitect-labs/REBL-distro).
&#x200B;
Maybe I'm being stupid. Maybe something like this already exists. Maybe it's infeasible. I don't know; I'm a pleb, but I would use this if it existed.
I did look for web scraper plugins and found [this list](https://www.octoparse.com/blog/9-free-web-scrapers-that-you-cannot-miss). However, none of them hit the spot; they seem to do some but not all of what I want, and none of them seemed that extensible.
&#x200B;
What do you guys think? How stupid and Dunning-Kruger am I being?
&#x200B;
a830pi
ProgrammingDiscussion
SameAgainTheSecond
t3_a830pi
https://www.reddit.com/r/ProgrammingDiscussion/comments/a830pi/a_browser_for_power_user/
12/20/2018 9:58:34 PM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
A Browser for power user
False
0.67
a830pi
0
1
1
1
8
1.88679245283019
4
0.943396226415094
0
0
183
43.1603773584906
424
128, 128, 128
3
Solid
50
No
428
Posted
9/22/2017 3:35:44 PM
Coupon: OCTOPARSE. Here's the link: http://www.gearbest.com/mount-holder/pp_162742.html?wid=21&lkid=11026666
71rrm2
couponsfromchina
r3crac
t3_71rrm2
https://www.reddit.com/r/couponsfromchina/comments/71rrm2/probably_australia_only_smartphone_flexible/
9/22/2017 3:35:44 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
[Probably Australia only!] Smartphone Flexible Tripod for 0.55 USD
False
1
71rrm2
0
1
1
1
0
0
0
0
0
0
4
80
5
128, 128, 128
3
Solid
50
No
427
Posted
4/30/2020 6:28:05 PM
Hey everyone,
New to this. I want to scrape a list of company websites I have from Indie Hackers, but the problem is that Octoparse shows me the in-between loading screen, which has a random quote, instead of the actual page.
Does anyone know how to fix this?
Thanks
gb19yp
scrapinghub
Bruce_wayne89
t3_gb19yp
https://www.reddit.com/r/scrapinghub/comments/gb19yp/how_to_scrap_indie_hacker_via_octoparse/
4/30/2020 6:28:05 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to scrap Indie hacker via Octoparse
False
0.5
gb19yp
0
1
1
1
0
0
2
4.08163265306122
0
0
19
38.7755102040816
49
128, 128, 128
3
Solid
50
No
426
Posted
4/1/2021 12:26:25 PM
How would you scrape the date from https://www.kexp.org/playlist/#? The date doesn't update when you select the "earlier" button at the bottom. The URL doesn't change either. Is there a unique page identifier not shown? I looked at the page code and couldn't find any differences between the pages. Thank you.
mhu6kq
webscraping
shocka_locka
t3_mhu6kq
https://www.reddit.com/r/webscraping/comments/mhu6kq/scrape_date_from_website_playlist_with_octoparse/
4/1/2021 12:26:25 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scrape date from website playlist with Octoparse?
False
1
mhu6kq
0
1
1
1
1
2.04081632653061
1
2.04081632653061
0
0
17
34.6938775510204
49
128, 128, 128
3
Solid
50
No
425
Posted
12/26/2022 10:12:21 AM
Looks like the times of old-school manual info collection have passed. Today we see numerous websites written by AI. When I worked in an office 15 years ago, we collected and researched information manually. We created different spreadsheets in Excel and sorted and analyzed the information there. I still remember having several A4 sheets listing Excel formulas that I used and kept failing to remember. If you think I am not smart enough and that the formulas are limited to SUM, see the list of formulas in the [official Microsoft guide](https://support.microsoft.com/en-us/office/overview-of-formulas-in-excel-ecfdc708-9162-49e8-b993-c311f47ca173). I just opened the link to share with you folks and thought: damn, why was there no single place with all the links listed? I had to take courses to master Excel details for accounting specialists.
Anyway, everything has changed a lot since then. A friend of mine runs a small business. He and his wife sell stuff on Amazon. They order different things from Chinese manufacturers and sell them to US customers through [amazon.com](https://amazon.com). I once came to see them and we had a small talk about their business. I know that Amazon is a huge marketplace and that it is hard to rank well and keep a business running. So I asked them how they watch their competitors and such. And you know what they told me? They scrape Amazon. They do not visit their competitors' shops to see what holiday discounts and prices they offer. They have special software that does this for them. And this smart scraper not only collects the data they need but also creates spreadsheet reports for them. So they save time.
You know, it was something totally new to me, because I am older and I know what data collection was like 15-18 years ago. This was the moment I realized that the future is already here. I don't remember what kind of scraping software they use, because the technical side was quite complicated for me, but as far as I understood, it is a custom solution. If you are interested in this topic, just google 'web scraping' or check the websites of the companies that offer these services. Here are some good examples: [Parsehub](https://www.parsehub.com/), [Forlake](https://forlake.io/), [Octoparse](https://www.octoparse.com/). I think today is a good time to start a business, because you no longer need to hire people to do data collection for your reports; there are technical solutions that can do it for you.
zvk7zc
u_Herr_Major
Herr_Major
t3_zvk7zc
https://www.reddit.com/r/u_Herr_Major/comments/zvk7zc/a_couple_of_words_about_web_scraping/
12/26/2022 10:12:21 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
A couple of words about web scraping
False
1
zvk7zc
0
1
1
1
15
3.28947368421053
7
1.53508771929825
0
0
194
42.5438596491228
456
128, 128, 128
3
Solid
50
No
424
Posted
5/3/2021 12:37:45 AM
I'm not aware if this post violates any community guidelines; if so, sorry Mods, I couldn't find any community rules.
Will be scraping pricing & other information from \~6,090 +/- websites (around 1,000+ data points per website) & storing the data in JSON files in which the data & file structure will be pre-outlined for you. Am willing to provide/pay for servers or platforms (Linode, DigitalOcean, Octoparse, etc.) for you to use to assist in the web scraping.
We will have the links compiled in a spreadsheet or JSON file (whichever you prefer).
We are willing to pay cash (PayPal, bank transfer, etc.) or to have you join our company (if we feel you bring the proper amount of value). Am willing to pay a fair price & not just go with the lowest bid, as our business is focused on our data.
I am the lead developer & majority shareholder. I would do it in-house, but with the creation of the web app & mobile app we don't have enough time.
&#x200B;
If anyone is interested you can PM me or email me at [jack@potp.io](mailto:jack@potp.io)
n3jy39
webscraping
jackphumphrey
t3_n3jy39
https://www.reddit.com/r/webscraping/comments/n3jy39/iso_web_scraper_for_hire/
5/3/2021 12:37:45 AM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ISO web scraper for hire
False
1
n3jy39
0
1
2
2
8
4.1025641025641
1
0.512820512820513
0
0
85
43.5897435897436
195
128, 128, 128
3
Solid
50
No
423
Commented
5/6/2021 5:46:53 AM
Hi There,
Your data requirement looks huge and seems like a frequent exercise. You may certainly opt to scrape web data manually if you have programming skills. But before you do that, I would like to bring to you some major issues with DIY techniques. Some of them are:
i) scraper getting blocked now and then
ii) legality (the scraping policies of the websites, which can attract legal trouble).
To avoid these hurdles, you may opt for scraping tools, but if you have custom requirements and need to scrape data regularly, then opting for a web scraping service could be the better way forward. A data scraping service provider is better equipped to handle complex, customised and large scraping requirements because of its:
* Experience in different domains
* Infrastructure
* Established workflow
* Team
If you would like to know more about the difference between a web scraping service and a tool, here is a link to help you with it.
Link:[ https://www.promptcloud.com/blog/web-scraping-tool-vs-web-scraping-services/](https://www.promptcloud.com/blog/web-scraping-tool-vs-web-scraping-services/).
Hope this helps.
gx4c1vg
webscraping
promptcloud
t1_gx4c1vg
https://www.reddit.com/r/webscraping/comments/n3jy39/iso_web_scraper_for_hire/gx4c1vg/
5/6/2021 5:46:53 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
n3jy39
t3_n3jy39
n3jy39
0
n3jy39
False
False
False
0
1
2
2
1
0.609756097560976
3
1.82926829268293
0
0
81
49.390243902439
164
128, 128, 128
3
Solid
50
No
422
Commented
5/3/2021 8:28:47 AM
Is the 6,000 websites with 1,000 data points a one off task or is obtaining this data an ongoing requirement?
Are the 6,000 websites all identical or is there variation to how the data is presented?
Do you know if any of these websites have captcha?
What is the timeframe and budget for this project?
gwrde5e
webscraping
makedatauseful
t1_gwrde5e
https://www.reddit.com/r/webscraping/comments/n3jy39/iso_web_scraper_for_hire/gwrde5e/
5/3/2021 8:28:47 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
n3jy39
t3_n3jy39
n3jy39
1
n3jy39
False
False
False
0
1
2
2
0
0
0
0
0
0
23
39.6551724137931
58
128, 128, 128
3
Solid
50
No
421
RepliedTo
5/3/2021 1:21:17 PM
I agree these are good questions, in addition I was wondering if you would consider scraping these 6000 sites a 1 person task or if there will be a team.
gws311n
webscraping
TheElectricSlide2
t1_gws311n
https://www.reddit.com/r/webscraping/comments/n3jy39/iso_web_scraper_for_hire/gws311n/
5/3/2021 1:21:17 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gwrde5e
t1_gwrde5e
gwrde5e
0
n3jy39
False
False
False
1
1
2
2
1
3.33333333333333
0
0
0
0
11
36.6666666666667
30
128, 128, 128
3
Solid
50
No
420
Posted
8/27/2020 10:18:38 AM
Hey there! I'm new in the community, so hi to everyone!
Well, the fact is that I'm creating my own website, and I'm trying to pick up the products from my old renting website.
But I don't know why, the software I'm using (Octoparse) doesn't complete the task with the correct workflow...
I already contacted the program's support, but they want a €250 payment to set up a correct workflow... and that's too much for me at the moment.
&#x200B;
I think the problem is that the page blocks the crawler... it could be...
If you have some time to try this out and see what happens, I'll be very grateful.
The page:
[https://ofiart.es/](https://ofiart.es/)
The software:
[https://www.octoparse.com/download](https://www.octoparse.com/download)
&#x200B;
&#x200B;
Thanks!!
ihihef
webscraping
Lilforeskin8
t3_ihihef
https://www.reddit.com/r/webscraping/comments/ihihef/issues_crawling_a_website_with_octoparse_free/
8/27/2020 10:18:38 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Issues crawling a website with Octoparse free account
False
1
ihihef
0
1
1
1
5
3.59712230215827
1
0.719424460431655
0
0
50
35.9712230215827
139
128, 128, 128
3
Solid
50
No
418
Commented
8/3/2021 12:17:35 PM
Hi There,
It is technically possible to [scrape data from Amazon ASIN](https://www.promptcloud.com/get-data-from-amazon-asin/). I work for PromptCloud, a web scraping service provider, and we have successfully worked on similar projects.
Since you have already tried a web scraping tool, you can check the two remaining approaches:
* You can do manual scraping using programming languages such as Python or Ruby on Rails.
* You can opt for a website scraping service provider for more customised scraping requirements.
If web scraping tools and services sound confusing, here is a link to help you differentiate between a web scraping service and a tool.
Link: [https://www.promptcloud.com/blog/web-scraping-tool-vs-web-scraping-services/](https://www.promptcloud.com/blog/web-scraping-tool-vs-web-scraping-services/).
Hope this helps.
h7jm0la
webscraping
promptcloud
t1_h7jm0la
https://www.reddit.com/r/webscraping/comments/oq6z3b/has_anyone_successfully_scraped_amazon_product/h7jm0la/
8/3/2021 12:17:35 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
oq6z3b
t3_oq6z3b
oq6z3b
0
oq6z3b
False
False
False
0
1
2
2
3
2.27272727272727
1
0.757575757575758
0
0
78
59.0909090909091
132
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
551
Commented
5/18/2021 8:03:48 AM
hi, have you tried this company [https://webautomation.io/pde/google-maps-web-scraper-now-extract-business-data-with-ease/217/](https://webautomation.io/pde/google-maps-web-scraper-now-extract-business-data-with-ease/217/)
gyjnewz
learnprogramming
VictorAVB
t1_gyjnewz
https://www.reddit.com/r/learnprogramming/comments/nezi7k/help_with_recipe_for_google_maps_using_octoparse/gyjnewz/
5/18/2021 8:03:48 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
nezi7k
t3_nezi7k
nezi7k
1
nezi7k
False
False
False
0
4
2
2
2
5.55555555555556
0
0
0
0
25
69.4444444444444
36
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
550
RepliedTo
5/18/2021 11:56:15 AM
No, because they don't offer a free service and only allow a trial, which is not worth the time compared to the free Octoparse.
gyk48b5
learnprogramming
kartikoli
t1_gyk48b5
https://www.reddit.com/r/learnprogramming/comments/nezi7k/help_with_recipe_for_google_maps_using_octoparse/gyk48b5/
5/18/2021 11:56:15 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gyjnewz
t1_gyjnewz
gyjnewz
1
nezi7k
True
False
False
1
4
2
2
3
14.2857142857143
0
0
0
0
7
33.3333333333333
21
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
549
Commented
5/18/2021 8:03:48 AM
hi, have you tried this company [https://webautomation.io/pde/google-maps-web-scraper-now-extract-business-data-with-ease/217/](https://webautomation.io/pde/google-maps-web-scraper-now-extract-business-data-with-ease/217/)
gyjnewz
learnprogramming
VictorAVB
t1_gyjnewz
https://www.reddit.com/r/learnprogramming/comments/nezi7k/help_with_recipe_for_google_maps_using_octoparse/gyjnewz/
5/18/2021 8:03:48 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
nezi7k
t3_nezi7k
nezi7k
1
nezi7k
False
False
False
0
4
2
2
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
548
RepliedTo
5/18/2021 11:56:15 AM
No, because they don't offer a free service and only allow a trial, which is not worth the time compared to the free Octoparse.
gyk48b5
learnprogramming
kartikoli
t1_gyk48b5
https://www.reddit.com/r/learnprogramming/comments/nezi7k/help_with_recipe_for_google_maps_using_octoparse/gyk48b5/
5/18/2021 11:56:15 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gyjnewz
t1_gyjnewz
gyjnewz
1
nezi7k
True
False
False
1
4
2
2
128, 128, 128
3
Solid
50
No
432
Commented
5/17/2021 6:47:55 PM
Hi, this is a Unix timestamp. To convert it to a date in Excel, use the formula `=A1/86400+DATE(1970,1,1)` (Excel stores dates as days, so divide the seconds by 86,400 and add the 1970 epoch).
gyh754z
webscraping
VictorAVB
t1_gyh754z
https://www.reddit.com/r/webscraping/comments/neo5t3/making_sense_of_octoparse_timestamps/gyh754z/
5/17/2021 6:47:55 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
neo5t3
t3_neo5t3
neo5t3
0
neo5t3
False
False
False
0
1
2
2
1
5
0
0
0
0
10
50
20
128, 128, 128
3
Solid
50
Yes
417
Commented
7/23/2021 10:01:46 PM
I use https://webautomation.io/pde/amazon-department-product-scraper/80/
h6asrm0
webscraping
VictorAVB
t1_h6asrm0
https://www.reddit.com/r/webscraping/comments/oq6z3b/has_anyone_successfully_scraped_amazon_product/h6asrm0/
7/23/2021 10:01:46 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
oq6z3b
t3_oq6z3b
oq6z3b
1
oq6z3b
False
False
False
0
1
2
2
0
0
0
0
0
0
1
50
2
128, 128, 128
3
Solid
50
Yes
416
RepliedTo
7/23/2021 10:57:34 PM
Sweet I will give this a try
h6azi1q
webscraping
jacksonsmomma06
t1_h6azi1q
https://www.reddit.com/r/webscraping/comments/oq6z3b/has_anyone_successfully_scraped_amazon_product/h6azi1q/
7/23/2021 10:57:34 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h6asrm0
t1_h6asrm0
h6asrm0
0
oq6z3b
True
False
False
1
1
2
2
1
14.2857142857143
0
0
0
0
2
28.5714285714286
7
128, 128, 128
3
Solid
50
No
415
Commented
3/4/2022 8:47:31 AM
Hi, you can try this scraping tool [https://www.scrapestorm.com/](https://www.scrapestorm.com/)
hzandp4
webscraping
scrapestorm
t1_hzandp4
https://www.reddit.com/r/webscraping/comments/oq6z3b/has_anyone_successfully_scraped_amazon_product/hzandp4/
3/4/2022 8:47:31 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
oq6z3b
t3_oq6z3b
oq6z3b
0
oq6z3b
False
False
False
0
1
2
2
0
0
0
0
0
0
6
40
15
128, 128, 128
3
Solid
50
No
414
Commented
4/1/2022 8:29:15 AM
Thanks for sharing! You can try ScrapeStorm, it is also a great scraping tool.
i2yk0nq
webscraping
scrapestorm
t1_i2yk0nq
https://www.reddit.com/r/webscraping/comments/o6o8g7/top_5_scraping_tools_for_beginners_importio_vs/i2yk0nq/
4/1/2022 8:29:15 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o6o8g7
t3_o6o8g7
o6o8g7
0
o6o8g7
False
False
False
0
1
2
2
1
7.14285714285714
0
0
0
0
6
42.8571428571429
14
128, 128, 128
3
Solid
50
No
413
Commented
7/24/2021 6:56:23 AM
I can recommend this service: https://outscraper.com/amazon-scraper/
Although its main focus is Google services, it also has an Amazon scraper.
h6cb660
webscraping
Dana_OS
t1_h6cb660
https://www.reddit.com/r/webscraping/comments/oq6z3b/has_anyone_successfully_scraped_amazon_product/h6cb660/
7/24/2021 6:56:23 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
oq6z3b
t3_oq6z3b
oq6z3b
0
oq6z3b
False
False
False
0
1
2
2
1
5.88235294117647
0
0
0
0
8
47.0588235294118
17
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
412
RepliedTo
7/14/2021 3:56:19 PM
How many items do you need for free?
h560hev
learnprogramming
Dana_OS
t1_h560hev
https://www.reddit.com/r/learnprogramming/comments/nezi7k/help_with_recipe_for_google_maps_using_octoparse/h560hev/
7/14/2021 3:56:19 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gyk48b5
t1_gyk48b5
gyk48b5
0
nezi7k
False
False
False
2
4
2
2
1
12.5
0
0
0
0
3
37.5
8
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
411
RepliedTo
7/14/2021 3:56:19 PM
How many items do you need for free?
h560hev
learnprogramming
Dana_OS
t1_h560hev
https://www.reddit.com/r/learnprogramming/comments/nezi7k/help_with_recipe_for_google_maps_using_octoparse/h560hev/
7/14/2021 3:56:19 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gyk48b5
t1_gyk48b5
gyk48b5
0
nezi7k
False
False
False
2
4
2
2
128, 128, 128
3
Solid
50
No
419
Posted
7/23/2021 5:16:44 PM
So far I've tried using Octoparse, but I can't get the workflow to select the correct product page and extract the text data from there. It only wants to work on the search page.
Any tips or better programs to do this you could please suggest?
oq6z3b
webscraping
jacksonsmomma06
t3_oq6z3b
https://www.reddit.com/r/webscraping/comments/oq6z3b/has_anyone_successfully_scraped_amazon_product/
7/23/2021 5:16:44 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Has anyone successfully scraped Amazon Product Info for a list of ASIN’s?
False
1
oq6z3b
0
1
2
2
3
6.38297872340426
0
0
0
0
16
34.0425531914894
47
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
410
Commented
7/23/2021 6:14:43 PM
I used Python Scrapy to scrape Amazon data using ASIN lists.
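For an ASIN-list crawl like that, the usual first step is turning the ASINs into product-page URLs, which a Scrapy spider would then use as its start URLs (a minimal sketch; the ASINs are made up, and Amazon product pages follow the `/dp/<ASIN>` pattern):

```python
def asin_urls(asins, domain="www.amazon.com"):
    """Build product-page URLs for a list of ASINs."""
    return [f"https://{domain}/dp/{asin}" for asin in asins]

# Hypothetical ASINs
print(asin_urls(["B08N5WRWNW", "B07XJ8C8F5"])[0])
# https://www.amazon.com/dp/B08N5WRWNW
```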
h69yqr6
webscraping
hellish_reader
t1_h69yqr6
https://www.reddit.com/r/webscraping/comments/oq6z3b/has_anyone_successfully_scraped_amazon_product/h69yqr6/
7/23/2021 6:14:43 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
oq6z3b
t3_oq6z3b
oq6z3b
1
oq6z3b
False
False
False
0
2
2
2
0
0
0
0
0
0
9
81.8181818181818
11
128, 128, 128
3
Solid
50
Yes
409
RepliedTo
7/23/2021 9:19:04 PM
Thank you
h6ancfi
webscraping
jacksonsmomma06
t1_h6ancfi
https://www.reddit.com/r/webscraping/comments/oq6z3b/has_anyone_successfully_scraped_amazon_product/h6ancfi/
7/23/2021 9:19:04 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h69yqr6
t1_h69yqr6
h69yqr6
1
oq6z3b
True
False
False
1
1
2
2
1
50
0
0
0
0
0
0
2
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
408
RepliedTo
7/24/2021 2:44:57 AM
Welcome
h6bp15h
webscraping
hellish_reader
t1_h6bp15h
https://www.reddit.com/r/webscraping/comments/oq6z3b/has_anyone_successfully_scraped_amazon_product/h6bp15h/
7/24/2021 2:44:57 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h6ancfi
t1_h6ancfi
h6ancfi
0
oq6z3b
False
False
False
2
2
2
2
1
100
0
0
0
0
0
0
1
128, 128, 128
3
Solid
50
No
407
Commented
10/12/2020 8:41:34 PM
Good topic, greetings
g8ml0j5
webscraping
josemontano
t1_g8ml0j5
https://www.reddit.com/r/webscraping/comments/gzjbkl/10_malentendidos_sobre_el_web_scraping/g8ml0j5/
10/12/2020 8:41:34 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gzjbkl
t3_gzjbkl
gzjbkl
0
gzjbkl
False
False
False
0
1
5
5
0
0
0
0
0
0
3
100
3
128, 128, 128
3.00094607379376
Solid
49.9959453980268
No
406
Posted
4/19/2023 11:43:25 AM
https://v.redd.it/mhngnx5pvtua1
12rqeoo
webscraping
dedpul218
t3_12rqeoo
https://www.reddit.com/r/webscraping/comments/12rqeoo/can_someone_please_tell_me_what_im_doing_wrong_i/
4/19/2023 11:43:25 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Can someone please tell me what I'm doing wrong. I am using Octoparse.
False
0.5
12rqeoo
0
2
1
1
128, 128, 128
3.00094607379376
Solid
49.9959453980268
No
405
Commented
4/19/2023 11:44:11 AM
[https://www.swiggy.com/city/bangalore](https://www.swiggy.com/city/bangalore) This is the webpage that I'm trying to scrape
jgv9smi
webscraping
dedpul218
t1_jgv9smi
https://www.reddit.com/r/webscraping/comments/12rqeoo/can_someone_please_tell_me_what_im_doing_wrong_i/jgv9smi/
4/19/2023 11:44:11 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
12rqeoo
t3_12rqeoo
12rqeoo
0
12rqeoo
True
False
False
0
2
1
1
0
0
0
0
0
0
9
42.8571428571429
21
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
404
Commented
9/14/2021 2:35:46 PM
Thanks, Melisa, for the good article! Have you thought about making YouTube videos?
hctoy9x
webscraping
clxyder
t1_hctoy9x
https://www.reddit.com/r/webscraping/comments/pnym31/tripadvisor_scraper_los_principales_destinos/hctoy9x/
9/14/2021 2:35:46 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pnym31
t3_pnym31
pnym31
1
pnym31
False
False
False
0
4
5
5
0
0
0
0
0
0
8
61.5384615384615
13
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
403
RepliedTo
9/15/2021 9:29:55 AM
Thank you very much for your comment! Yes, in the near future we will make YouTube videos about this topic! And if you are interested, you could check out our YouTube channel: https://www.youtube.com/channel/UCY0zk3opGl15B4GBUAaOVLg
hcxitvw
webscraping
melisaxinyue
t1_hcxitvw
https://www.reddit.com/r/webscraping/comments/pnym31/tripadvisor_scraper_los_principales_destinos/hcxitvw/
9/15/2021 9:29:55 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hctoy9x
t1_hctoy9x
hctoy9x
0
pnym31
True
False
False
1
4
5
5
0
0
0
0
0
0
19
54.2857142857143
35
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
402
Commented
9/14/2021 2:35:46 PM
Thanks, Melisa, for the good article! Have you thought about making YouTube videos?
hctoy9x
webscraping
clxyder
t1_hctoy9x
https://www.reddit.com/r/webscraping/comments/pnym31/tripadvisor_scraper_los_principales_destinos/hctoy9x/
9/14/2021 2:35:46 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pnym31
t3_pnym31
pnym31
1
pnym31
False
False
False
0
4
5
5
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
401
RepliedTo
9/15/2021 9:29:55 AM
Thank you very much for your comment! Yes, in the near future we will make YouTube videos about this topic! And if you are interested, you could check out our YouTube channel: https://www.youtube.com/channel/UCY0zk3opGl15B4GBUAaOVLg
hcxitvw
webscraping
melisaxinyue
t1_hcxitvw
https://www.reddit.com/r/webscraping/comments/pnym31/tripadvisor_scraper_los_principales_destinos/hcxitvw/
9/15/2021 9:29:55 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hctoy9x
t1_hctoy9x
hctoy9x
0
pnym31
True
False
False
1
4
5
5
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
399
Commented
12/21/2022 8:48:47 AM
Seems like there's either an issue with your proxies OR your user agents. I would suggest changing the user agents list first, and if that doesn't solve the issue, try a new proxy provider.
Are you currently using datacenter proxies? I would suggest using either residential or mobile proxies instead (if simply changing the user-agent list doesn't fix the issue).
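Rotating user agents amounts to sampling a different agent string per request (a minimal sketch; the agent strings are illustrative, and a real scraper would pass these headers to its HTTP client along with a rotating proxy):

```python
import random

# Illustrative user-agent strings; a real pool would be larger and kept current
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15",
    "Mozilla/5.0 (Linux; Android 13) AppleWebKit/537.36",
]

def random_headers(agents=USER_AGENTS):
    """Return request headers with a randomly chosen User-Agent."""
    return {"User-Agent": random.choice(agents)}

print(random_headers()["User-Agent"])
```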
j133mbi
webscraping
ProxyEmpire_io
t1_j133mbi
https://www.reddit.com/r/webscraping/comments/znghei/what_is_the_best_way_to_bypass_the_octoparse/j133mbi/
12/21/2022 8:48:47 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
znghei
t3_znghei
znghei
1
znghei
False
False
False
0
2
41
41
0
0
3
4.91803278688525
0
0
30
49.1803278688525
61
128, 128, 128
3
Solid
50
Yes
398
RepliedTo
12/21/2022 4:45:10 PM
I used the proxy provider IPRoyal, with residential proxies. And Octoparse has its own user agent; I did not find a way to change it. Not sure what the problem is.
j14inji
webscraping
Independent-Savings1
t1_j14inji
https://www.reddit.com/r/webscraping/comments/znghei/what_is_the_best_way_to_bypass_the_octoparse/j14inji/
12/21/2022 4:45:10 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j133mbi
t1_j133mbi
j133mbi
1
znghei
True
False
False
1
1
41
41
0
0
1
3.44827586206897
0
0
15
51.7241379310345
29
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
397
RepliedTo
12/21/2022 5:32:07 PM
Wanna give ProxyEmpire a try? Would be happy to give you a free trial, just PM me your e-mail. It might solve your issues :)
j14q0me
webscraping
ProxyEmpire_io
t1_j14q0me
https://www.reddit.com/r/webscraping/comments/znghei/what_is_the_best_way_to_bypass_the_octoparse/j14q0me/
12/21/2022 5:32:07 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j14inji
t1_j14inji
j14inji
0
znghei
False
False
False
2
2
41
41
2
8
1
4
0
0
9
36
25
128, 128, 128
3
Solid
50
No
400
Posted
12/16/2022 2:54:06 PM
This screen appears when I enter the Zillow website address in Octoparse. Could you please let me know what to do about it?
I am using residential proxy and user agent rotation.
https://preview.redd.it/07q71xxrw96a1.png?width=779&format=png&auto=webp&v=enabled&s=d301ddca5856604ff75a1562f01c3ec9205504cd
znghei
webscraping
Independent-Savings1
t3_znghei
https://www.reddit.com/r/webscraping/comments/znghei/what_is_the_best_way_to_bypass_the_octoparse/
12/16/2022 2:54:06 PM
12/17/2022 1:05:36 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What is the best way to bypass the Octoparse human verification requirement?
False
1
znghei
0
1
41
41
0
0
0
0
0
0
16
45.7142857142857
35
128, 128, 128
3
Solid
50
No
396
Commented
12/16/2022 4:52:10 PM
This shouldn't happen with multiple proxies unless you are using a few proxies that are already detected.
j0h7mhx
webscraping
web_scraping_corps
t1_j0h7mhx
https://www.reddit.com/r/webscraping/comments/znghei/what_is_the_best_way_to_bypass_the_octoparse/j0h7mhx/
12/16/2022 4:52:10 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
znghei
t3_znghei
znghei
0
znghei
False
False
False
0
1
41
41
0
0
0
0
0
0
9
56.25
16
128, 128, 128
3
Solid
50
No
395
Posted
12/28/2021 9:06:28 AM
&#x200B;
https://preview.redd.it/h6rcg11g19881.png?width=2240&format=png&auto=webp&v=enabled&s=3e3232c64919e2e6b668cec4445bdf552ae99991
Web scraping tools are software developed specifically to simplify the process of extracting data from websites. Data mining is a rather useful and commonly used process, but it can also easily turn into a complicated and messy activity and take a lot of time and effort.
**So what does a web scraper do?**
A web scraper uses robots to extract structured data and content from a website by extracting the underlying HTML code and data stored in a database.
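That core extraction step can be illustrated with only the standard library (a minimal sketch; the HTML fragment and the `price` class are made up, and real tools layer crawling, proxies, and export formats on top):

```python
from html.parser import HTMLParser

class PriceParser(HTMLParser):
    """Collect the text of every element whose class is 'price'."""
    def __init__(self):
        super().__init__()
        self.in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # Flag elements carrying class="price"
        if ("class", "price") in attrs:
            self.in_price = True

    def handle_endtag(self, tag):
        self.in_price = False

    def handle_data(self, data):
        if self.in_price:
            self.prices.append(data.strip())

# Hypothetical page fragment
html = '<ul><li class="price">$19.99</li><li class="price">$5.00</li></ul>'
parser = PriceParser()
parser.feed(html)
print(parser.prices)  # ['$19.99', '$5.00']
```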
In data mining, many sub-processes are involved, whether it's preventing your IP address from being banned, crawling the source website properly, generating data in a compatible format, or cleaning up the data. Fortunately, web scrapers and data scraping tools make this process simple, fast, and reliable.
Often, the online information to be retrieved is too large to be retrieved manually. This is why companies using web scraping tools can collect more data in less time and at a lower cost.
In addition, companies that profit from data scraping take a step forward in competing against competitors over the long term.
In this article, you will find a list of the top 13 best web scraping tools compared based on their features, price, and ease of use.
**13 Best Web Scraping Tools**
Here's a list of the best web scraping tools:
&#x200B;
1. Luminati (BrightData)
2. Scrapingdog
3. Newsdata.io
4. AvesAPI
5. ParseHub
6. Diffbot
7. Octoparse
8. ScrapingBee
9. Scrape.do
10. Grepsr
11. Scraper API
12. Scrapy
13. Import.io
Web scraper tools search for new data either manually or automatically. They retrieve updated or new data and then archive it for easy access. These tools are useful for anyone trying to collect data from the Internet.
For example, web scraping tools can be used to collect real estate data, hotel data from major travel portals, and product, pricing, and review data from e-commerce websites. So basically, if you are wondering where you can scrape data, these are the tools to do it with.
Now let's look at the list of the best web scraping tools in comparison, to answer the question: which is the best web scraping tool?
## [1. Scrape.do](https://scrape.do/)
**Scrape.do** is an easy-to-use web scraper tool which provides a scalable and fast web scraping proxy API endpoint. Based on affordability and functionality, Scrape.do tops the list. As you will see in the rest of this article, Scrape.do is one of the cheapest web scraping tools on the market.
Unlike its competitors, Scrape.do doesn't charge any additional fees for Google and other hard-to-scrape websites.
It offers the best value for money on the market for Google scraping (SERP): 5,000,000 SERPs for $249.
Additionally, Scrape.do has an average speed of 23 seconds to collect anonymous data from Instagram and a 99% success rate.
Its gateway speed is also 4 times that of its competitors.
In addition, this tool offers residential and mobile proxy access at half the cost.
Here are some of its other features.
**Features**
* Includes rotating proxies that let you scrape any website; Scrape.do routes every request made to the API through its proxy pool
* Unlimited bandwidth on all plans
* Fully customizable
* Billing only for successful requests
* Geo-targeting option for more than 10 countries
* JavaScript rendering, which allows web pages that require JavaScript to be scraped
* A super proxy setting that lets you extract data from websites protected by datacenter-IP blocking
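Proxy-API services of this kind are typically called by passing the target URL and an API token as query parameters on the endpoint. As a rough sketch, the endpoint path and parameter names (`token`, `url`, `render`) below are assumptions based on common proxy-API conventions, not the vendor's documented interface:

```python
from urllib.parse import urlencode

# Hypothetical base endpoint for a proxy-API scraper such as Scrape.do.
API_BASE = "https://api.scrape.do/"

def build_request_url(token: str, target: str, render_js: bool = False) -> str:
    """Build the request URL that asks the service to fetch `target` for us."""
    params = {"token": token, "url": target}
    if render_js:
        params["render"] = "true"  # ask the service to render JavaScript first
    return API_BASE + "?" + urlencode(params)

print(build_request_url("MY_TOKEN", "https://example.com/products", render_js=True))
```

Fetching that URL with any HTTP client would then return the target page's HTML, with proxy rotation handled server-side.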
**Pricing**
Pricing plans start at $29/month. The Pro plan is $99/month for 1,300,000 API calls.
## 2. Scrapingdog
[**Scrapingdog**](https://www.scrapingdog.com/) is a web scraping tool that simplifies the management of proxies, browsers, and CAPTCHAs. It returns the HTML of any web page in a single API call. One of Scrapingdog's best features is that it also offers a LinkedIn API. Here are some of its other important features.
**Features**
* Rotates the IP address on every request and bypasses CAPTCHAs, so you can scrape without being blocked
* JavaScript rendering
* Webhook
* Headless Chrome
* Suitable for everyone who needs web scraping, from developers to non-developers
**Pricing**
Pricing plans start at $20/month. JavaScript rendering requires at least the Standard plan ($90/month), and the LinkedIn API is only available on the Pro plan ($200/month).
## 3. Newsdata.io
[**Newsdata.io**](https://newsdata.io/) is a SaaS-based web tool that gives its users direct access to structured, real-time data by crawling a large number of news sources. It fetches news data from the most reliable sources in the world, in 30+ languages, from 50+ countries, across 10+ categories.
Newsdata.io's news-scraping API can also extract online discussions from forums and store the output in a variety of formats, including JSON, XML, and RSS. The API provides data with low latency and high coverage.
**Features**
* 3000+ news data sources
* Export the data in JSON, Excel, CSV
* Free news datasets
* Customized historical news data reports
**Pricing**
Newsdata.io's pricing plans range from $49.99/month up to custom pricing; a free plan is also offered for testing and non-commercial use.
## 4. AvesAPI
[**AvesAPI**](https://avesapi.com/) is a SERP API (Search Engine Results Page) tool that allows developers and agencies to extract structured data from Google search.
Unlike the other services on our list, AvesAPI focuses tightly on the data you are going to extract rather than on broader web scraping. It is therefore best suited to SEO tools and agencies as well as marketing professionals.
This web scraper offers an intelligent distributed system that can easily extract data for millions of keywords, sparing you the tedious workload of manually checking SERP results and dodging CAPTCHAs.
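To give a feel for consuming structured SERP output of this kind, here is a sketch that parses a hypothetical JSON payload with the standard library. The field names (`query`, `results`, `position`, `url`) are invented for illustration and do not reflect AvesAPI's actual response schema:

```python
import json

# Hypothetical SERP response payload, as a service might return it.
payload = json.loads("""
{"query": "web scraping tools",
 "results": [
   {"position": 1, "title": "Top Tools", "url": "https://example.com/a"},
   {"position": 2, "title": "More Tools", "url": "https://example.com/b"}
 ]}
""")

# Reduce the payload to the ranking data an SEO tool would track.
top = [(r["position"], r["url"]) for r in payload["results"]]
print(top)
```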
**Features:**
* Get structured data in JSON or HTML in real-time
* Get top 100 results from any location and any language
* Geo-specific search for local results
* Analyze product data from shopping results
**Disadvantage:** Because this tool was created quite recently, it's hard to tell what real users think of it. Still, given what the product promises, it is worth trying for free to see for yourself.
**Pricing:** AvesAPI's pricing is quite affordable compared to other web scraping tools, and you can try the service for free.
Paid plans start at $50 per month for 25,000 searches.
## 5. ParseHub
[**ParseHub**](https://www.parsehub.com/) is a free web scraping tool developed for online data mining. It comes as a downloadable desktop application and offers more features than most other scrapers; for example, you can scrape images and files and download your data as CSV or JSON. Here is a list of its other features.
**Features**
* IP Rotation
* Cloud-based for automatic data archiving
* Scheduled collection (to collect data monthly, weekly, etc.)
* Regular expressions to clean up text and HTML before downloading data
* API and webhooks for REST integrations
* JSON and Excel format for downloads
* Get data from tables and maps
* Infinite scrolling pages
* Get data behind a login
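The regex-cleanup step mentioned in the features above can be sketched in a few lines of Python; the sample markup is invented, and the patterns are deliberately simple (a real pipeline would use a proper HTML parser for anything non-trivial):

```python
import re

# Sketch of the kind of regex cleanup a scraper applies before export:
# strip leftover tags, replace simple entities, collapse whitespace.
raw = "<p>ParseHub&nbsp;exports   <b>clean</b>\n text</p>"

text = re.sub(r"<[^>]+>", "", raw)        # remove HTML tags
text = re.sub(r"&[a-z]+;", " ", text)     # replace simple entities with a space
text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace

print(text)  # ParseHub exports clean text
```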
**Pricing:** ParseHub offers a variety of features, but most of them are not included in its free plan. The free plan covers 200 pages of data in 40 minutes and 5 public projects.
Paid plans start at $149/month, so the extra features come at a noticeably higher cost. If your business is small, you may be better off with the free version or one of the cheaper web scrapers on our list.
## 6. Diffbot
[**Diffbot**](https://www.diffbot.com/) is another web scraping tool that provides data extracted from web pages. This data scraper is one of the best content extractors: its Analyze API automatically classifies pages and extracts products, articles, discussions, videos, or images.
**Features**
* API product
* Plain text and HTML
* St...
Posted by digitally_rajat, 12/28/2021, "Top 13 Web scraping tools in 2022" (https://www.reddit.com/r/u_digitally_rajat/comments/rqblwk/top_13_web_scraping_tools_in_2022/)
Commented
3/11/2022 9:19:52 AM
Thanks for sharing! I think ScrapeStorm is also a good web scraping tool, you can have a try.
Comment by Born-Project89757 (https://www.reddit.com/r/u_digitally_rajat/comments/rqblwk/top_13_web_scraping_tools_in_2022/i07zb69/)
RepliedTo
11/8/2016 3:53:29 PM
can you give me some insight into how you were able to make this happen?
Comment by digitalAlchemist_ in r/socialmedia (https://www.reddit.com/r/socialmedia/comments/5boj4i/social_media_likes_as_triggers_for_app/d9r6l07/)
Commented
5/2/2019 9:18:40 PM
If it’s in JavaScript, Scrapy + Splash callback in python will work well.
Comment by FullMetalMahnmut in r/webscraping (https://www.reddit.com/r/webscraping/comments/bjx3xo/is_it_even_possible_to_scrape_salesflower/emcpdky/)
Posted
5/2/2019 4:41:34 PM
Hi all, thanks for checking this out.
My boss has sent me on a mission to scrape a bunch of leads from Salesflower while we still have the free trial. Never done any sort of scraping before.
I'm starting to think it's not possible with just a normal scraper like Octoparse or ParseHub. I can't seem to get it to work, and the page constantly refreshes and logs out, which makes it frustrating, to say the least.
Posted by nertynertt in r/webscraping, "Is it even possible to scrape SalesFlower?" (https://www.reddit.com/r/webscraping/comments/bjx3xo/is_it_even_possible_to_scrape_salesflower/)
Commented
5/2/2019 7:03:00 PM
The site is probably rendered dynamically with javascript instead of being pure HTML.
You'll need something like selenium or puppeteerJS to scrape it. Free online scrapers aren't going to cut it.
Comment by prodiver in r/webscraping (https://www.reddit.com/r/webscraping/comments/bjx3xo/is_it_even_possible_to_scrape_salesflower/emcal4e/)
Posted
1/28/2022 4:16:16 PM
Hello! I'm very new to web scraping and I had an idea for an app that involves scraping items from delivery services, such as DoorDash, using Octoparse 8. I tried to follow the tutorial on the website by clicking the post block for an item, but when I try to select another item in the same category, it does not see the sub-elements. It only finds the first entry for each category, and the rest are showing as just green without the sub-elements highlighted. Does anyone know how to parse each post block regardless of category? I do know a bit of python but almost nothing about HTML or Xpath. I also tried to find a duplicate in this sub-reddit but could not find what I was looking for.
[Video of Attempt to Scrape the Data](https://imgur.com/a/HDjyHVQ)
Edit: [Here's the tutorial I used](https://www.octoparse.com/tutorial-7/capture-a-list-of-items)
Posted by jrusse in r/webscraping, "Scraping Delivery Services With Octoparse 8" (https://www.reddit.com/r/webscraping/comments/seuc1y/scraping_delivery_services_with_octoparse_8/)
Posted
6/25/2022 6:56:54 AM
[removed]
Posted by Smooth-Solution4108 in r/PiratedGames, "Octoparse 8.4.2 Crack With Activation Key (x64) Free Download 2022" (https://www.reddit.com/r/PiratedGames/comments/vk95ho/octoparse_842_crack_with_activation_key_x64_free/)
Commented
6/25/2022 6:56:54 AM
Make sure to read the stickied [megathread](https://rentry.org/pgames-mega-thread), as it might just answer your question! Also check out our [videogame piracy guide](https://www.reddit.com/r/PiratedGames/comments/i3r14g/a_beginners_guide_to_video_game_piracy/) and the list of Common Q&A [part 1](https://www.reddit.com/r/PiratedGames/comments/fvix6e/common_questions_and_answers_thread/) and [part 2](https://www.reddit.com/r/PiratedGames/comments/igxebs/frequently_asked_questions_part_2/). Or just read the whole [Wiki](https://www.reddit.com/r/PiratedGames/wiki/index).
*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/PiratedGames) if you have any questions or concerns.*
Comment by AutoModerator (https://www.reddit.com/r/PiratedGames/comments/vk95ho/octoparse_842_crack_with_activation_key_x64_free/idntuyc/)
Posted
8/23/2022 12:08:00 PM
In order to gain actionable insights, many companies and analysts depend on Amazon, one of the largest e-commerce websites, for data. It is a top priority for businesses that collect, store, and analyze data at scale. Be it product prices, seller details, or even information on general market trends, almost anything can be captured from Amazon to inform smart business decisions. Growing e-commerce operations demand a sophisticated [Amazon scraper](https://outsourcebigdata.com/blog/amazon-scraper/why-pay-for-amazon-scraper-try-free-web-scraper-for-amazon/) to collect that data. Amazon has anti-scraping measures in place, since it has a vested interest in safeguarding its data, so only a cutting-edge scraper can extract everything you need. AI-powered web scrapers for Amazon are in high demand because they deliver high accuracy, flexibility, and scalability.
By using AI-powered Amazon scraper software, you can easily scrape Amazon data at scale and predict the next big shopping trend. A state-of-the-art [Amazon scraping tool](https://outsourcebigdata.com/data-automation/web-scraping-services/amazon-scraping-tool/) can reduce all the challenges that come your way when you scrape data from Amazon.
## What is an Amazon Scraper?
The concept of Amazon scraping is not new: businesses have long gained market insights by collecting Amazon data with scrapers. An Amazon scraper is a tool that extracts data from HTML and delivers it in a comprehensible form. This digital bot is programmed to gather data from Amazon effortlessly, a handy and efficient way to collect specific data from such a dynamic platform. Amazon's product list is extensive, so using scraper software for data collection is the obvious choice. Nine out of ten consumers price-check products on Amazon, which makes Amazon pricing data especially valuable. Using an Amazon scraping tool, you can gather price data for research or corporate use, as well as for personal reference.
## What Is The Process Of Scraping Amazon?
Now that we know what an Amazon scraper is, it's time to discuss how to use one. Whether you are an individual or an enterprise, the resulting data can help you grow your business substantially. A web scraper for Amazon is crafted to make data collection less challenging, and some companies also provide Amazon scraping services to help businesses with their big data needs. To scrape Amazon manually, you would first search for the product you want to extract, then navigate to the product detail page to collect the detailed description, price, product images, customer reviews, seller details, and so on, and finally copy that data into a spreadsheet. Manual scraping at that scale is impractical given the platform's massive product library. Alternatively, you can sit back and outsource the work: using AI-powered, smart Amazon scraper software, AIMLEAP delivers high-quality, accurate data to its customers, so you can stop tiring your eyes scraping Amazon by hand.
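The copy-into-a-spreadsheet step described above is the part a scraper automates once the fields are extracted. A minimal sketch with Python's `csv` module; the rows and field names are invented sample data, not output of any real scraper:

```python
import csv
import io

# Once product fields have been scraped, write them to a spreadsheet-friendly
# CSV instead of copy-pasting by hand.
rows = [
    {"asin": "B000TEST01", "title": "Sample Kettle", "price": "24.99", "rating": "4.5"},
    {"asin": "B000TEST02", "title": "Sample Toaster", "price": "39.99", "rating": "4.2"},
]

buf = io.StringIO()  # in a real run this would be an open file
writer = csv.DictWriter(buf, fieldnames=["asin", "title", "price", "rating"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```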
## What Kind Of Data Is Obtained From Amazon Scraper?
The number of products on Amazon is huge, and a product listing is not limited to images: it includes prices, available offers, item details, specifications, seller details, and so on. With an Amazon scraper you can capture product specifications, price, seller price and details, ASIN, sales rank, product images, and customer reviews. Amazon scrapers are designed to collect the minutest details, using smart techniques for careful data observation and collection. The data you collect can be used for competitor evaluation, sentiment analysis, online reputation monitoring, product-ranking analysis, and more. With an Amazon scraping tool, businesses get a wealth of information ready to support smart decision-making.
## How Can You Scrape Amazon For Free?
If you can scrape Amazon, you can most likely scrape other online marketplaces too. There is a wide range of free Amazon scraper software on the market; with one, you can extract Amazon product listings in an easy-to-understand format. Look for a free scraper with rich features and functionality and start web scraping without hassle. A [web scraper](https://outsourcebigdata.com/data-automation/web-scraping-services/) for Amazon with a built-in AI-powered mechanism filters out obsolete data and delivers authentic results. Some free Amazon scraping tools pair the free tier with an outcome-based pricing model, so users pay only for what they consume beyond it.
## Top Free Amazon Scrapers
Whether you need to scrape Amazon at large or small scale, an Amazon scraper spares you the hassle. The free Amazon scrapers available on the market are highly advanced: they can bypass IP blocks, CAPTCHAs, and even deceptive HTTP 200 success codes without delaying data delivery. A free web scraper for Amazon not only saves you money but also improves the efficiency of your data scraping process. Here is our list of top free Amazon scraper software that can fulfill your data scraping needs.
### 1. ApiScrapy
[ApiScrapy](https://apiscrapy.com/) is a leading company that provides a pre-built advanced Amazon scraper to make data scraping easy for the user. Accurate data can be collected in a predefined format of your choice. ApiScrapy’s goal is to make large-scale data accessible at an affordable price. Its Amazon scraper software can extract data from multiple web pages within minutes. One can also schedule data scraping at one’s convenience using cutting-edge data scrapers from ApiScrapy.
### 2. Data Miner
Data Miner is a Google Chrome extension that works as an efficient Amazon scraper. It helps you collect data from web pages into a CSV file or Excel spreadsheet. On this web scraper for Amazon, a number of custom recipes are available that help you scrape data at scale. The Amazon scraper software comes with a friendly user interface and advanced features to help you execute advanced data extraction. Small businesses can choose Data Miner for casual use. For increased data scraping needs, other paid plans are available.
### 3. Web Scraper
Web Scraper is an Amazon scraper packaged as a browser extension. With a point-and-click interface, it simplifies data extraction from sites with multiple levels of navigation. Use it to scrape web pages and export data in CSV format, or make use of the API and webhooks provided by Web Scraper Cloud to receive data exported in CSV, XLSX, and JSON formats. The tool lets you tailor data extraction to diverse site structures.
### 4. Scraper Parsers
Scraper Parsers is a free browser-extension Amazon scraper that extracts unstructured data into structured formats (XLSX, XLS, XML, CSV). With it you can scrape URLs, images, tables, individual values, directories, and scripts, schedule scraping jobs, and get real-time data as your business needs demand. There is no limit on the number of pages it can scrape.
### 5. Amazon Scraper – Trial Version
Designed specifically for Amazon, the Amazon Scraper – Trial Version scrapes prices, shipping costs, product headers, product information, product images, and ASINs from any Amazon search page, making extraction of Amazon data effortless. The trial version is limited to downloading the first two pages of any search result; a full license allows you to download an unlimited number of pages and includes free support for the first year after purchase.
#...
Posted by outsourcebigdata, "Why pay for Amazon scraper, try free web scraper for Amazon" (https://www.reddit.com/r/u_outsourcebigdata/comments/wvn3o2/why_pay_for_amazon_scraper_try_free_web_scraper/)
Posted
8/18/2022 1:10:25 PM
# 10 Free Web Scraping Software You Should Explore In 2022
When it comes to making important decisions, both businesses and individuals rely on mission-critical data. It’s impossible to collect voluminous data manually and that’s where the [web scraping](https://outsourcebigdata.com/data-automation/web-scraping-services/) software comes into play. Well-designed, advanced data scraping tools make the entire process of [data extraction](https://outsourcebigdata.com/data-automation/web-scraping-services/web-data-extraction-services/) easy and fast. You can extract a massive amount of data in a structured format easily using a website data extractor. Today, businesses make use of AI-powered data extraction tools to automate web scraping and carry out web scraping effectively. Choosing web scraping as a service is also a smart idea, as you get experts to conduct the process of data extraction quickly, accurately, and within your budget.
## What is a Web Scraping Software?
Web scraping software extracts the raw data available on the internet and structures it into the format of your choice, improving a business's decision-making. A website data extractor improves the efficiency and accuracy of the entire extraction process by using AI; when you automate web scraping with such software, you can automatically fetch URLs, videos, images, content, and more in a structured form. Employees waste up to half of their time dealing with mundane data quality tasks (MIT Sloan). Bid adieu to time-consuming copy-paste scraping and use feature-rich, AI-powered data scraping tools instead. The price of web scraping as a service varies from tool to tool; if you want to fetch a large volume of data in real time, pick a scraper intended for beginner through advanced users.
## Types Of Web Scraping Tools
Data scraping is done manually using the copy-pasting method and automatically using web scraping software. There is a surge in the adoption of automated data scraping tools for fetching a large volume of data.
With ever-changing digital market trends, tech companies are assisting business owners and individuals in decision making by providing them with fully managed web scraping as a service. Collecting data has become a breeze, as one can use a pre-built or custom website data extractor from a trusted offshore company to automate web scraping tasks.
### a) Web-based Scraping Application
A web-based scraping application browses the web to extract data without requiring any code. It is a pre-built tool, delivered as web scraping as a service, that lets you collect accurate and precise data. Businesses can use web-based scraping applications to avoid the cost of building a new scraper and to automate the web scraping process.
### b) Web Scraping Plugin/Extension
Web scraping plugin/extension is something that can be added to your browser to get data out of web pages and into spreadsheets. A web scraping plugin/extension runs through the browser and scrapes websites in a few clicks. Within a simple point-and-click interface, the user can extract thousands of records from a website rapidly. You don’t need coding experience to start scraping, as web data extraction browser extension/plugin can automate web scraping.
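Under the hood, the core task such a point-and-click extension automates is walking the page markup and collecting records, link by link and field by field. A stdlib sketch of that task (the sample page is invented):

```python
from html.parser import HTMLParser

# Collect every hyperlink target on a page, the way an extension's
# "select this element" step does behind its point-and-click UI.
class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

page = '<ul><li><a href="/page/1">1</a></li><li><a href="/page/2">2</a></li></ul>'
collector = LinkCollector()
collector.feed(page)
print(collector.links)  # ['/page/1', '/page/2']
```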
### c) Client-based Web Scraping Tools
Many companies provide web scraping software or applications that can be personalised according to different business needs. They deliver web scraping as a service especially for business owners and enterprises. An AI-powered web scraping tool designed keeping your business requirements in mind can deliver unmatched results. The client-based web scraping tools also have integration options that enable better automation of the whole data scraping process.
## Why Pay When There Are Free Web Scraping Software Available?
Businesses that want to save money often look for "one size fits all" web scraping solutions. Depending on the type of business you run, there are hundreds of free data scraping tools that will save you unwanted expense and hassle. Free pre-built website data extractors are readily available online; unlike custom software, they are affordable and very easy to set up and use. From integration to management to maintenance, the web-scraping-as-a-service provider handles it all for you. Why pay for web scraping software or services when plenty of free options on the market can automate the process? Using an AI-powered data scraper is highly recommended, as it increases extraction efficiency and accuracy.
## Top 10 Free Web Scraping Software
Luckily, people who can't code have access to free data scraping tools that match their data requirements. Dynamically designed web scraping software lets people obtain web data at large scale, fast; with a smart AI-powered mechanism, a free website data extractor can crawl millions of pages and download data in the format of the user's choice. Automate web scraping and fetch high-quality data within seconds, because your business decisions depend on the market insights that data provides. For businesses that need bulk scraping but lack a development team to put a solution together, web-scraping-as-a-service providers come to the rescue.
### 1. ApiScrapy
[ApiScrapy](https://apiscrapy.com/) provides users with access to free web scraping software that helps them fetch high-quality data at scale. They have 10K+ pre-built data scraping tools designed by an army of skilled developers for different business requirements. Use an AI-powered website data extractor from ApiScrapy to fetch millions of data sets in minutes. The tool delivers data in a pre-defined format and charges according to the outcomes delivered to the users.
### 2. Octoparse
Built for businesses and enterprises, [web scraping software](https://outsourcebigdata.com/data-automation/web-scraping-services/) from Octoparse makes data scraping easy. Professionals without coding skills can use the Octoparse website data extractor. With its intuitive user interface, users can scrape data effortlessly. It is free web scraping as a service software that provides ready-to-use web scraping templates to extract data from digital platforms.
### 3. Content Grabber
Content Grabber is a powerful, visual web scraping software that automatically harvests data from digital platforms and delivers it in multiple formats such as Excel spreadsheets, CSV, or XML files. It automates web scraping and extracts data from websites that most other extraction tools cannot handle.
### 4. Import.io
Import.io is web scraping as a service software that can integrate the web data into analytic tools to gain authentic market insights. Using this web scraping software, users can automate the web scraping cycle and harvest data in the structured format of their choice.
### 5. Mozenda
Fulfill your scalable data needs with Mozenda’s free web scraping software. One of the best data scraping tools designed by Mozenda helps companies collect and organize data in the most efficient and cost-effective way. The website data extractor can be integrated with any business system without IT involvement.
### 6. Parsehub
Are you a researcher/data analyst who lacks programming skills? Adopt a Parsehub website data extractor that reduces the hassles involved in data harvesting from dynamic websites. This web scraping software also includes an IP rotation feature that allows you to change your IP address while visiting websites that use anti-scraping measures.
### 7. Crawlmonster
The dynamic web scraping software is designed for SEO experts and marketers. It is one of the best data scraping tools available to users for free. Using this website data extractor, users can crawl websites to analyse their content, source code, page status, etc. Using this tool, web scraping becomes a hassle-free task.
### 8. Diffbot
Diffbot is smart web scraping software that uses machine learning to extract hi...
Posted by outsourcebigdata (https://www.reddit.com/r/u_outsourcebigdata/comments/wrj7qz/10_free_web_scraping_software_you_should_explore/)
Commented
8/18/2022 6:01:58 PM
u/outsourcebigdata you should check out Browse AI too! (https://www.browse.ai)
Comment by ardalanme (https://www.reddit.com/r/u_outsourcebigdata/comments/wrj7qz/10_free_web_scraping_software_you_should_explore/iktpf71/)
Commented
10/10/2022 5:00:50 PM
If it’s less than 200 scrapes a month, www.browse.ai can do it for free. It’s pretty easy to use and their api has c# examples too
Comment by ardalanme in r/csharp (https://www.reddit.com/r/csharp/comments/xysv32/looking_for_some_project_suggestions_for_scraping/irs5npl/)
Commented
10/9/2022 7:26:55 AM
You could use the GPT-3 API, feed it the site's text, then prompt it for structural information, like "The author of above article is:" and so on.
Pros: May work across many sites without knowing the HTML in advance. Would also be resilient to HTML changes on the site. Would even allow you to query for custom summaries or translations.
Cons: API usage costs money. And it would need to be tested whether this approach really works well.
So, a lot depends on your project scope, goals and publishing scenario.
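To make the suggestion above concrete, here is a minimal sketch of the prompt-based extraction idea. The field names and example text are hypothetical, and `fake_llm` stands in for a real GPT-3 API call (which costs money and needs an API key):

```python
# Sketch of prompt-based extraction: feed the site's text, then ask the
# model for one structural fact at a time. `complete` is any
# text-completion callable (a stub here; a real run would call the API).

def build_prompt(page_text: str, field: str) -> str:
    """Append the structural question after the scraped page text."""
    return f"{page_text}\n\nThe {field} of the above article is:"

def extract_field(page_text: str, field: str, complete) -> str:
    """Run one prompt and clean up the completion."""
    return complete(build_prompt(page_text, field)).strip()

# Stubbed completion so the sketch runs offline.
fake_llm = lambda prompt: " Jane Doe" if "author" in prompt else " unknown"
print(extract_field("Some article text...", "author", fake_llm))  # Jane Doe
```

The same `extract_field` call can be repeated with "title", "keywords", etc., which is what makes this approach site-agnostic compared to per-site HTML selectors.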
irlxr3g
csharp
Philipp
t1_irlxr3g
https://www.reddit.com/r/csharp/comments/xysv32/looking_for_some_project_suggestions_for_scraping/irlxr3g/
10/9/2022 7:26:55 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
xysv32
t3_xysv32
xysv32
0
xysv32
False
False
False
0
1
15
15
6
6.59340659340659
1
1.0989010989011
0
0
40
43.956043956044
91
128, 128, 128
3
Solid
50
No
384
Posted
10/8/2022 1:29:46 PM
I'm just looking for some suggestions towards existing C# projects or articles re: scraping a URL.
I'd like to be able to send a URL, often from a news article, and get back things like the author and maybe some keywords. I've seen Scrapy has a C# port but was just wondering if anyone had done something along these lines already.
The examples I've seen so far lead me to believe I would have to have a different scraping profile for each site.
I have looked at OctoParse and a few others but I haven't had much success...
xysv32
csharp
azraels_ghost
t3_xysv32
https://www.reddit.com/r/csharp/comments/xysv32/looking_for_some_project_suggestions_for_scraping/
10/8/2022 1:29:46 PM
10/8/2022 1:56:03 PM
False
False
2
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Looking for some project suggestions for Scraping a News Url for specific content
False
0.67
xysv32
0
1
15
15
2
2.10526315789474
0
0
0
0
43
45.2631578947368
95
128, 128, 128
3
Solid
50
No
380
Commented
10/8/2022 2:03:30 PM
Yes, different for each site. I use HtmlAgilityPack. It's on GitHub and NuGet
irilq7o
csharp
rickrat
t1_irilq7o
https://www.reddit.com/r/csharp/comments/xysv32/looking_for_some_project_suggestions_for_scraping/irilq7o/
10/8/2022 2:03:30 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
xysv32
t3_xysv32
xysv32
1
xysv32
False
False
False
0
1
15
15
0
0
0
0
0
0
8
66.6666666666667
12
128, 128, 128
3
Solid
50
Yes
379
RepliedTo
10/9/2022 10:02:39 AM
I like AngleSharp better for its CSS query selectors (like jQuery).
irm90ph
csharp
JTarsier
t1_irm90ph
https://www.reddit.com/r/csharp/comments/xysv32/looking_for_some_project_suggestions_for_scraping/irm90ph/
10/9/2022 10:02:39 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
irilq7o
t1_irilq7o
irilq7o
1
xysv32
False
False
False
1
1
15
15
1
9.09090909090909
0
0
0
0
5
45.4545454545455
11
128, 128, 128
3
Solid
50
Yes
378
RepliedTo
10/9/2022 10:18:02 AM
I haven’t tried it yet.
irma31k
csharp
rickrat
t1_irma31k
https://www.reddit.com/r/csharp/comments/xysv32/looking_for_some_project_suggestions_for_scraping/irma31k/
10/9/2022 10:18:02 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
irm90ph
t1_irm90ph
irm90ph
0
xysv32
False
False
False
2
1
15
15
0
0
0
0
0
0
2
33.3333333333333
6
128, 128, 128
3
Solid
50
Yes
377
RepliedTo
7/17/2022 2:58:15 PM
What categories do you analyze reviews for in Meltwater? One of my clients is a QSR brand and I don’t think I’ve been able to find local reviews (e.g., Google Maps, Yelp) on Meltwater, but hoping maybe I’m missing something! I know we can buy this data elsewhere, but I’d love to look at this data as a value-add/wouldn’t go the route of buying it separately for this particular use case.
ETA: we also have forsta surveys/decipher. Which forsta tools do you use for scraping? Wondering if I could get our rep to set us up with a trial.
Thanks in advance!!
igivhl7
Marketresearch
-dikki
t1_igivhl7
https://www.reddit.com/r/Marketresearch/comments/vzxb2l/deleted_by_user/igivhl7/
7/17/2022 2:58:15 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
igcssx1
t1_igcssx1
igcssx1
1
vzxb2l
False
False
False
1
1
58
58
1
0.900900900900901
0
0
0
0
52
46.8468468468468
111
128, 128, 128
3
Solid
50
Yes
376
RepliedTo
7/17/2022 8:59:27 PM
I know Forsta can scrape product reviews - have a chat to your Rep.
I’ve only had a light play with Meltwater
igk9azy
Marketresearch
Saffa1986
t1_igk9azy
https://www.reddit.com/r/Marketresearch/comments/vzxb2l/deleted_by_user/igk9azy/
7/17/2022 8:59:27 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
igivhl7
t1_igivhl7
igivhl7
0
vzxb2l
False
False
False
2
1
58
58
0
0
0
0
0
0
11
50
22
128, 128, 128
3
Solid
50
No
375
Posted
4/21/2023 5:21:39 AM
https://youtube.com/watch?v=0P6jPOrAesc&feature=share
12trdgl
u_limjetwee
limjetwee
t3_12trdgl
https://www.reddit.com/r/u_limjetwee/comments/12trdgl/octoparse_a_nocode_web_scrapper/
4/21/2023 5:21:39 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
OctoParse ~ A No-Code Web Scraper
False
1
12trdgl
0
1
1
1
128, 128, 128
3.01419110690634
Dash Dot Dot
49.9391809704014
No
374
Posted
9/21/2020 9:51:32 AM
**Table of Contents**
[3 Practical Uses of Ecommerce Data](https://www.octoparse.com/blog/3-most-practical-uses-of-ecommerce-data-scraping-tools#h1)
[3 popular eCommerce data scraping tools](https://www.octoparse.com/blog/3-most-practical-uses-of-ecommerce-data-scraping-tools#h2)
[Conclusion](https://www.octoparse.com/blog/3-most-practical-uses-of-ecommerce-data-scraping-tools#h3)
In today’s eCommerce world, data scraping tools have gained great popularity as the competition among eCommerce business owners grows fiercer year by year. Data scraping tools have become the go-to technique for improving performance.
A lot of store owners find that using an eCommerce data scraping tool to monitor competitors’ activities and customers’ behaviors helps them maintain their competitiveness and improve sales. If you have no idea how to make full use of eCommerce data scraping tools, stay with me and we will look into the 3 most practical uses of a scraping tool and how it helps grow your business.
# Three Practical Uses of Ecommerce Data
## 1) [Price Monitoring](https://www.octoparse.com/blog/top-10-price-monitoring-tool)
Price is one of the most critical aspects that affect customers’ buying interest. 87% of online shoppers indicate that price is the most important factor affecting buying behavior, followed by shipping cost and speed. That research suggests that a potential customer won’t hesitate to leave your store if your price doesn’t match their expectations.
In addition, according to a study from AYTM, 78 percent of shoppers compare prices between two or more brands, then opt for the lowest price. With easy access to many free online price comparison tools, online shoppers can easily see the price of a specific item across dozens of brands and marketplaces.
It is necessary for online business owners to have an eCommerce data scraping tool to scrape price information from competitors’ web pages or from price comparison apps. If not, you will likely have trouble attracting new customers to your store or maintaining your current customer base, because you won’t know when and how to adjust your price to cater to price-sensitive customers.
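As a rough illustration of what you might do with scraped competitor prices, here is a minimal sketch (the prices and the ranking rule are made-up examples, not output from any particular tool):

```python
# Toy price-monitoring check: rank our price against a list of
# competitor prices pulled by a scraping tool (illustrative numbers).

def price_position(our_price: float, competitor_prices: list) -> str:
    """Summarise where our price sits relative to scraped prices."""
    cheaper = sum(1 for p in competitor_prices if p < our_price)
    if cheaper == 0:
        return "lowest"
    if cheaper == len(competitor_prices):
        return "highest"
    return f"undercut by {cheaper} of {len(competitor_prices)} competitors"

scraped = [19.99, 22.50, 18.75, 21.00]   # competitor prices from the scraper
print(price_position(20.49, scraped))    # undercut by 2 of 4 competitors
```

Run on a schedule, a check like this is what turns raw scraped prices into a repricing signal.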
## 2) [Competitor Analysis](https://www.octoparse.com/blog/competitor-monitoring-for-price-strategy-and-product-planning)
We’re aware that improving the shipping service is another way to increase sales. 56% of online sellers offer free shipping (and easy returns) regardless of the purchase price or the product type.
Lots of online sellers use free shipping as a marketing strategy to nudge people to buy from them, or even buy more from them. For example, it’s quite common that customers are more willing to spend $100 on a product with free shipping than to buy a $90 product that costs another $10 for shipping. Besides, it’s common for customers to buy more items in order to qualify for a free shipping offer.
You can use an eCommerce data scraping tool to find out how many of your competitors are offering a free shipping service. Using a data scraping tool, you can easily scrape and collect the data in real-time. In this case, if they don’t provide a free shipping service, you can attract their customers by offering it.
## 3) Customer [Sentiment Analysis](https://www.octoparse.com/blog/text-mining-and-sentiment-analysis-using-python)
Knowing how your competitors’ audiences feel about the products or brands can help you evaluate your marketing strategy and customer experience management. ECommerce data scraping tools can help you gather such information.
The voices of customers that you gather from your competitors will help you understand what customers value and how you can better serve them. Their voices are mostly scattered among comments and conversation under your competitors’ stores and posts and interactions on their social media. With such information at hand, you will know what customers want from the product – what they like or dislike.
To outcompete your competitors, it is necessary to gather all this information, look into it, and draw conclusions. Then you can adjust your marketing strategy or your products/services accordingly.
*Now you are probably wondering what scraping tools can be used for these purposes. Here, I would like to share with you this shortlist of the most popular eCommerce data scraping tools. You should try them out!*
# 3 popular eCommerce data scraping tools
## 1) [Octoparse](http://www.octoparse.com/)
Octoparse is a free and powerful eCommerce data scraping tool with a user-friendly point-and-click interface. Both Windows and Mac users will find it easy-to-use for extracting almost all kinds of data you need from a website. With its brand new auto-detect algorithm, users with/without coding knowledge are able to extract tons of data within seconds.
**Pros:** Octoparse provides over 50 [pre-built templates](https://www.octoparse.com/blog/big-announcement-web-scraping-template-take-away) for all users, covering big websites such as Amazon, Facebook, Twitter, Instagram, Walmart, etc. All you need to do is enter the keywords and URL, then wait for the data. In addition, it provides a free version for everyone. Premium users can use features such as crawler scheduling and [cloud extraction](https://helpcenter.octoparse.com/hc/en-us/articles/360018047092-What-is-Cloud-Extraction-) to make the process less time-consuming.
**Cons:** Octoparse cannot scrape data from PDF files. It can’t download files automatically, although it allows you to [extract the URLs of images](https://helpcenter.octoparse.com/hc/en-us/articles/360018047452-Can-Octoparse-extract-images-videos-files-), PDFs and other types of files. You can use automatic download software to [download these files in bulk](https://helpcenter.octoparse.com/hc/en-us/articles/360018324071-How-to-download-images-from-a-list-of-URLs-) with the URLs scraped by Octoparse.
## 2) [Parsehub](https://www.parsehub.com/)
ParseHub works with single-page apps, multi-page apps and other modern web technology. ParseHub can handle Javascript, AJAX, cookies, sessions, and redirects. You can easily fill in forms, [loop through dropdowns](https://helpcenter.octoparse.com/hc/en-us/articles/360018281571-How-to-click-through-options-in-a-drop-down-menu-), [login to websites](https://helpcenter.octoparse.com/hc/en-us/articles/360018008832-Text-keyword-input), click on interactive maps and deal with websites that apply [infinite scrolling techniques](https://helpcenter.octoparse.com/hc/en-us/articles/360018281551-Dealing-with-Infinitive-Scrolling-Load-More).
**Pros:** Parsehub supports both Windows and Mac OS systems. It provides a free version for people with eCommerce data scraping needs.
**Cons:** The free version is quite limited, with only 5 projects and 200 pages per run. It doesn’t support document extraction, and some advanced functions can be tricky to use.
## 3) [80legs](https://80legs.com/)
80legs is a web data extraction tool that allows users to create and run web crawlers through its software as a service platform. It’s built on top of a distributed grid computing network. This grid consists of approximately 50,000 individual computers distributed across the world and uses bandwidth monitoring technology to prevent bandwidth cap overages.
**Pros:** 80legs is more suitable for small companies and individuals. It offers unique service plans so that customers pay only for what they crawl.
**Cons:** 80legs is not designed for getting huge amounts of data; you must choose between custom crawled data sets, a [pre-built API](https://helpcenter.octoparse.com/hc/en-us/articles/360028160091-Connect-Octoparse-API-step-by-step), and having a crawl application developed.
# Conclusion
Once you know how to use eCommerce data scraping tools to help you get the needed data, what insights you can gain from the data is another story. Try to do some data analysis and find ways to visualize the data. Put your data into use.
You can try the simple analysis methods mentioned in this article to [get to know your users through data anal...
iwxe6i
scrapinghub
Mike_M1989
t3_iwxe6i
https://www.reddit.com/r/scrapinghub/comments/iwxe6i/3_most_practical_uses_of_ecommerce_data_scraping/
9/21/2020 9:51:32 AM
1/1/0001 12:00:00 AM
False
False
6
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
3 Most Practical Uses of eCommerce Data Scraping Tools
True
1
iwxe6i
0
16
59
59
42
3.15552216378663
11
0.826446280991736
0
0
722
54.2449286250939
1331
128, 128, 128
3.01419110690634
Dash Dot Dot
49.9391809704014
No
373
Posted
9/7/2020 10:08:47 AM
[Web data extraction](https://www.octoparse.com/blog/web-data-extraction-2020) is gaining popularity as one of the great ways to collect useful data to fuel the business cost-effectively. Although web data extraction has existed for quite some time, it has never been as heavily used, or as reliable as it is today. This guide aims to help web scraping beginners to get a general idea of web data extraction.
**Table of Contents**
**What is web data extraction**
**+ Benefits of web data extraction**
* E-commerce price monitoring
* Marketing analysis
* Lead generation
**+ Web data extraction for non-programmers**
* Octoparse
* Cyotek WebCopy
* Getleft
* OutWit Hub
* WebHarvy
**Conclusions**
**What is web data extraction**
Web data extraction is the practice of massive data copying done by bots. It goes by many names, depending on what people like to call it: web scraping, data scraping, web crawling, to name a few. The data extracted (copied) from the internet can be saved to a file on your computer, or to a database.
**Benefits of web data extraction**
Businesses can get a load of benefits from web data extraction. It can be used more widely than you expect, but it would suffice to point out how it is used in a few areas.
**1** **E-commerce price monitoring**
The importance of price monitoring speaks for itself, especially when you sell items on an online marketplace such as Amazon, eBay, Lazada, etc. These platforms are transparent; that is, buyers, including any of your competitors, have easy access to prices, inventory, reviews, and all kinds of information for each store. This means you can’t just focus on price but also need to keep an eye on other aspects of your competitors. Hence, in addition to prices, there is more for you to dig into: price monitoring may be about more than prices.
Most retailers and e-commerce vendors try to put as much information about their products online as possible. This is helpful for buyers to evaluate, but also is too much exposure for the store owners because, with such information, competitors can get a glimpse of how you run your business. Fortunately, you can use these data to do the same thing.
You should gather information such as price, inventory levels, discounts, product turnover, new items added, new locations added, product category ASP, etc., from your competitors as well. With these data at hand, you can fuel your business with the benefits below.
1. Increase margins and sales by adjusting prices at the right time on the right channels.
2. Maintain or improve your competitiveness in the marketplace.
3. Improve your cost management by using competitor prices as a negotiating ground with suppliers, or review your own overheads and production cost.
4. Come up with effective pricing strategies, especially during promotion such as season-end sales or holiday seasons.
**2 Marketing Analysis**
Thanks to the easy entry offered by the Internet, almost anyone can start their own business by going online. As businesses increasingly sprout on the Internet, competition among retailers grows fiercer. To make your business stand out and to maintain sustainable growth, you can do more than just lower your price or launch advertising campaigns. These can be productive for a business in its initial stage, but in the long run, you should keep an eye on what other players are doing and adapt your strategies to the ever-changing environment.
You can study your customers and your competitors by scraping product prices, customer behaviors, product reviews, events, stock levels, and demands, etc. With this information, you’ll gain insights on how to improve your service and products and how to stand out among your competitors. Web data extraction tools can streamline this process, providing you with always up-to-date information for marketing analysis.
Get a better understanding of your customers’ demands and behaviors, and then find some specific customers’ needs to make exclusive offerings.
1. Analyze customer reviews and feedback for products and services of your competitors to make improvements to your own product.
2. Make a predictive analysis to help foresee future trends, plan future strategies and timely optimize your prioritization.
3. Study your competitors’ copies and product images to find out the most suitable ways to differentiate yourself.
**3 Lead generation**
There is no doubt that being able to generate more leads is one of the key skills for growing your business. How do you generate leads effectively? A lot of people talk about it, but few know how to do it well. Most salespeople, however, are still looking for leads on the Internet in a traditional, manual way, a typical example of wasting time on trivial work.
Nowadays, smart salespeople will search for leads with the help of web scraping tools, running through social media, online directories, websites, forums, etc, so as to save more time to work on their promising clients. Just leave this meaningless and boring lead copying work to your crawlers.
When you use a web crawler, don’t forget to collect the information below for lead analysis. After all, not every lead is worth spending time on. You need to prioritize the prospects who are ready or willing to buy from you.
1. Personal information: Name, age, education, phone number, job position, email
2. Company information: Industry, size, website, location, profitability
As time passes by, you’ll collect a lot of leads, even enough to build your own CRM. Having a database of email addresses of your target audience, you can send out information, newsletters, invitations for an event or advertisement campaigns in bulk. But beware of being too spammy!
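Once the personal and company fields above are scraped, prioritizing leads is straightforward to automate. A minimal sketch (the names, emails, and scoring rule here are invented for illustration):

```python
# Toy lead prioritization: drop duplicate emails (the same lead is often
# scraped from several sources), then rank industry matches first.

def prioritize(leads: list, target_industry: str) -> list:
    """De-duplicate by email, then sort target-industry leads to the front."""
    seen, unique = set(), []
    for lead in leads:
        if lead["email"] not in seen:
            seen.add(lead["email"])
            unique.append(lead)
    # False sorts before True, so matches come first; sort is stable.
    return sorted(unique, key=lambda l: l["industry"] != target_industry)

leads = [
    {"name": "A. Buyer", "email": "a@x.com", "industry": "retail"},
    {"name": "B. Dev",   "email": "b@y.com", "industry": "software"},
    {"name": "A. Buyer", "email": "a@x.com", "industry": "retail"},  # scraped twice
]
print(prioritize(leads, "software")[0]["email"])  # b@y.com
```

The same structure extends naturally to the CRM idea above: the de-duplicated list is exactly what you would bulk-load into a contacts database.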
**How does web data extraction work?**
After knowing what you can benefit from a web data extraction tool, you may want to build one on your own to harvest the fruits of this technique. It’s important to first understand how a crawler works and what web pages are built on before starting your journey of web data extraction.
1. Build a crawler with a programming language and enter the URL of the website you want to scrape. The crawler sends an HTTP request to the URL of the webpage. If the site grants you access, it responds to your request by returning the content of the webpage.
2. Parsing the webpage is the other half of web scraping. The scraper inspects the page and interprets the tree structure of the HTML. The tree structure works as a navigator that helps the crawler follow paths through the web structure to get the data.
3. After that, the web data extraction tool extracts the data fields you want to scrape and stores them. Lastly, when the extraction is finished, choose a format and export the scraped data.
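The parsing step can be sketched with nothing but the standard library. This is a toy illustration (the HTML snippet stands in for a real response body fetched with `urllib` or `requests`):

```python
# Minimal "interpret the HTML tree" step: walk the parsed document and
# collect (link text, href) pairs, using only the standard library.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the text and href of every <a> tag encountered."""
    def __init__(self):
        super().__init__()
        self.links, self._href = [], None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._href:                       # we are inside an <a> tag
            self.links.append((data.strip(), self._href))
            self._href = None

html = '<ul><li><a href="/p/1">Item one</a></li><li><a href="/p/2">Item two</a></li></ul>'
p = LinkExtractor()
p.feed(html)
print(p.links)  # [('Item one', '/p/1'), ('Item two', '/p/2')]
```

Real scrapers do the same walk with richer selectors, but the fetch/parse/extract shape is identical.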
The process of web scraping is easy to understand, but it’s definitely not easy to build one from scratch for non-technical people. Luckily, there are many free web data extraction tools out there thanks to the development of big data. Stay tuned, there are some nice and free scrapers I would love to recommend to you.
**Web data extraction for non-programmers**
Here are 5 popular web data extraction tools rated by many non-technical users. If you’re new to the web data extraction, you should give it a try.
[Octoparse](https://www.octoparse.com/)
Octoparse is a powerful website data extraction tool. Its user-friendly point-and-click interface can guide you through the entire extraction process effortlessly. What's more, the auto-detection process and ready-to-use templates make scraping much easier for newcomers.
**Cyotek WebCopy**
It is self-evident that WebCopy serves as a data extraction tool for websites. It is a free tool for copying full or partial websites locally onto your hard disk for offline viewing. WebCopy will scan the specified website and download its content onto your hard disk. Links to resources such as style sheets, images, and other pages on the website will automatically be remapped to match the local path. Using its extensive configuration, you can define which parts of a website will be copied and how.
...
io4v3j
scrapinghub
Mike_M1989
t3_io4v3j
https://www.reddit.com/r/scrapinghub/comments/io4v3j/web_data_extraction_the_definitive_guide_2020/
9/7/2020 10:08:47 AM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Web Data Extraction: The Definitive Guide 2020
True
0.56
io4v3j
0
16
59
59
72
5.28246515040352
9
0.66030814380044
0
0
668
49.0095377842993
1363
128, 128, 128
3.01419110690634
Dash Dot Dot
49.9391809704014
No
372
Posted
9/1/2020 9:25:39 AM
The financial market is a place of risk and instability. It’s hard to predict how the curve will go, and for investors, one decision can sometimes be a make-or-break move. That’s why experienced practitioners never lose track of financial data.
We human beings are wired to think in the short term. Unless we have a database with well-structured data, we are not able to get a handle on voluminous information. Data scraping is the solution that puts complete data at your fingertips.
**Table of Contents**
[What We Are Scraping When We Scrape Financial Data?](https://www.octoparse.com/blog/scrape-financial-data-without-python#h1)
[Why Scrape Financial Data?](https://www.octoparse.com/blog/scrape-financial-data-without-python#h2)
[How to Scrape Financial Data without Python](https://www.octoparse.com/blog/scrape-financial-data-without-python#h3)
[Let’s get started!](https://www.octoparse.com/blog/scrape-financial-data-without-python#h4)
# What We Are Scraping When We Scrape Financial Data?
When it comes to scraping financial data, stock market data is in the spotlight. But there’s more: trading prices and movements of securities, mutual funds, futures, cryptocurrencies, etc. Financial statements, press releases, and other business-related news are also sources of financial data that people scrape.
# Why Scrape Financial Data?
Financial data, when extracted and analyzed in real time, can provide a wealth of information for investments and trading. People in different positions scrape financial data for varied purposes.
* **Stock market prediction**
Stock trading organizations leverage data from online trading portals like [Yahoo Finance](https://in.finance.yahoo.com/) to keep records of stock prices. This financial data helps companies predict market trends and buy/sell stocks for the highest profits. The same goes for trades in futures, currencies, and other financial products. With complete data at hand, cross-comparison becomes easier and a bigger picture emerges.
* **Equity research**
“Don’t put all the eggs in one basket.” Portfolio managers do equity research to predict the performance of multiple stocks. Data is used to identify the patterns of their changes and further develop an algorithmic trading model. Before getting to this end, a vast amount of financial data will be involved in the quantitative analysis.
* **Sentiment analysis of the financial market**
Scraping financial data is not merely about numbers. Things can go qualitatively. We may find that the presupposition raised by Adam Smith is untenable - people are not always economic, or say, rational. Behavioral economics reveals that our decisions are susceptible to all kinds of cognitive biases, plainly, emotions.
Using the data from financial news, blogs, relevant social media posts, and reviews, financial organizations can perform sentiment analysis to grab people’s attitudes towards the market, which can be an indicator of the market trend.
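For readers who do code, the core of such sentiment analysis can be illustrated with a toy lexicon approach. The word lists below are a tiny invented example; real analysis would use a proper NLP model or a curated financial lexicon:

```python
# Toy lexicon-based sentiment scoring over scraped text: count
# optimistic words minus pessimistic words (illustrative word lists).
POSITIVE = {"good", "great", "bullish", "gain", "up"}
NEGATIVE = {"bad", "bearish", "loss", "down", "crash"}

def sentiment(text: str) -> int:
    """Positive score = optimistic tone, negative = pessimistic."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(sentiment("great quarter and markets up"))       # 2
print(sentiment("bearish outlook after the crash"))    # -2
```

Aggregated over thousands of scraped posts per day, even a crude score like this can hint at shifts in market mood.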
# How to Scrape Financial Data without Python
If you are a non-coder, stay tuned, let me explain how you can scrape financial data [with the help of Octoparse](https://www.octoparse.com/download). Yahoo Finance is a nice source to get comprehensive and real-time financial data. I will show you below how to scrape from the site.
Besides, there are lots of financial data sources with up-to-date and valuable information you can scrape from, such as [Google Finance](https://www.google.com/finance), [Bloomberg](https://www.bloomberg.com/markets/stocks/world-indexes/americas), [CNNMoney](https://money.cnn.com/data/markets/), [Morningstar](https://www.morningstar.com/), [TMXMoney](https://www.tmxmoney.com/en/index.html), etc. All these sites are HTML codes in nature, which means that all the tables, news articles, and other texts/URLs can be extracted in bulk by a web scraping tool.
*To know more about what web scraping is and what it is used for, you can check out* [*this article*](https://www.octoparse.com/blog/big-data-what-is-web-scraping-and-why-does-it-matter).
# Let’s get started!
**There are 3 ways to** **get the data:**
* **Use a web scraping template**
* **Build your web crawlers**
* **Turn to data scraping services**
# 1. Use a Yahoo Finance web scraping template
In order to help newbies get an easy start on web scraping, Octoparse offers an array of [web scraping templates](https://www.octoparse.com/blog/big-announcement-web-scraping-template-take-away). These templates are preformatted crawlers ready-to-use. Users can pick one of them to pull data from respective pages instantly.
The Yahoo Finance template offered by Octoparse is designed to scrape the Cryptocurrency data. No more configuration is required. Simply click “try it” and you will get the table data in minutes.
# 2. Build a crawler from scratch in 2 steps
In addition to cryptocurrency data, you can also build a crawler from scratch in 2 steps to scrape [world indices from Yahoo Finance](https://finance.yahoo.com/world-indices). A customized crawler is highly flexible in terms of data extraction. This method also works for scraping other pages on Yahoo Finance.
Step 1: Enter the web address to build a crawler
The bot will load the website in the built-in browser, and a one-click on the Tips Panel can trigger the auto-detection process and get the table data fields done.
Step 2: Execute the crawler to get data
When your desired data are all highlighted in red, save the settings, and run the crawler. As you can see in the pop-up, all the data are scraped down successfully. Now, you can export the data into Excel, JSON, CSV, or your database via APIs.
# 3. Financial data scraping services
If you are scraping financial data from time to time in a rather small amount, help yourself with handy web scraping tools. You may find joy in building your own crawlers. However, if you are in need of voluminous data for a profound analysis, say millions of records, and have a high standard of accuracy, it is better to hand your scraping needs to [a group of reliable web scraping professionals](https://service.octoparse.com/data-service).
**Why are data scraping services worth it?**
1. Time and energy saving
The only thing you need to do is convey clearly to the data service provider what data you want. Once this is done, the data service team will handle all the rest with no hassle. You can plunge into your core business and do what you are good at. Let professionals get the scraping job done for you.
2. Zero learning curve & tech issues
Even the easiest scraping tool takes time to master. The ever-changing environment of different websites may be hard to deal with. And when you are scraping on a large scale, you may encounter issues such as IP bans, low speed, duplicate data, etc. Data scraping services can free you from these troubles.
3. No legal violations
If you do not pay enough attention to the terms of service of the data sources you are scraping from, you may get yourself into trouble. With the support of experienced legal counsel, a professional web scraping service provider works in accordance with the law, and the whole scraping process will be implemented in a legitimate manner.
Read more:
[Cryptocurrency Market Analysis with Web Scraping](https://www.octoparse.com/blog/cryptocurrency-market-analysis-with-web-scraping)
[Scrape information from Yahoo Finance](https://helpcenter.octoparse.com/hc/en-us/articles/360027003052-Scrape-information-from-Yahoo-Finance)
[Scrape Stock Info from Bloomberg](https://helpcenter.octoparse.com/hc/en-us/articles/360034323011-Scrape-Stock-Info-from-Bloomberg)
[Video: ](https://www.youtube.com/watch?v=ST5havU5GlY)[Web Scraping | Cryptocurrency Market](https://www.youtube.com/watch?v=ST5havU5GlY)
ikh4y5
webscraping
Mike_M1989
t3_ikh4y5
https://www.reddit.com/r/webscraping/comments/ikh4y5/3_ways_to_scrape_financial_data_without_python/
9/1/2020 9:25:39 AM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
3 Ways to Scrape Financial Data WITHOUT Python
True
0.67
ikh4y5
0
16
59
59
28
2.2029897718332
17
1.3375295043273
0
0
672
52.8717545239969
1271
128, 128, 128
3.01419110690634
Dash Dot Dot
49.9391809704014
No
371
Posted
10/16/2020 3:20:00 AM
https://i.redd.it/eqe426qaidt51.png
jc28li
infographic
Mike_M1989
t3_jc28li
https://www.reddit.com/r/infographic/comments/jc28li/an_infographic_designed_by_octoparse/
10/16/2020 3:20:00 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
An infographic designed by Octoparse. https://www.octoparse.com/blog/best-data-scraping-tools-for-2019-top-10-reviews
True
1
jc28li
0
16
59
59
128, 128, 128
3
Solid
50
No
370
Commented
2/20/2021 9:00:17 AM
Good article! As a digital marketer, I also use web scraping. I'd like to add some more cases when web crawling is helpful:
* a [web scraping service](https://www.finddatalab.com/) can let you speed up the process of lead generation;
* it lets you keep an eye on competitors' activities (not only pricing but also any updates on the website);
* it allows managing social media activities and scraping data about potential customers like their interests, opinions, etc.;
* web scraping lets digital marketers quickly find bad opinions about the brand across the web.
Further, web scraping is also used by students to find data for their research. What other applying fields do you know?
go3r6db
scrapinghub
Digital_Lover119
t1_go3r6db
https://www.reddit.com/r/scrapinghub/comments/iwxe6i/3_most_practical_uses_of_ecommerce_data_scraping/go3r6db/
2/20/2021 9:00:17 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iwxe6i
t3_iwxe6i
iwxe6i
0
iwxe6i
False
False
False
0
1
59
59
4
3.53982300884956
1
0.884955752212389
0
0
58
51.3274336283186
113
128, 128, 128
3
Solid
50
No
369
Posted
11/3/2018 5:41:08 PM
Hello people,
I have used octoparse as an easy way to scrape websites for a few school projects now and would like to incorporate this into my work. We have over 200 bitly links, and unless you have bitly enterprise ($15,000 annually) they don't let you extract the data. I created an octoparse workflow that enters the username and password and selects the login button to get to the main dashboard. Once I am in, I can select the content I want in a list and export it easily.
&#x200B;
THE ISSUE: the bitly website uses AJAX to continuously scroll through your link clicks, populating 30 at a time. Even though I told octoparse to load the page as AJAX and enabled the scrolling feature, I can't seem to grab more than the first 30 on the initial page load. The way the page is set up, nothing happens when you log in and start scrolling, because the top half of the header is a bar chart of all your links. The scrolling area I am scraping from is on the bottom left half of the page.
&#x200B;
Does anybody know how I can get the scrolling to work if it is only on a portion of the website? This would save me from either a) spending a shitload of time weekly doing it manually or b) $15,000 annually (lol).
&#x200B;
Please help! P.S. I am willing to do this in python, but then I would have to download beautiful soup, and also the UI of octoparse is very nice and I would never need a premium license, so I just figured for work I would take the easy route!
9tw1sa
scrapinghub
Black_Magic100
t3_9tw1sa
https://www.reddit.com/r/scrapinghub/comments/9tw1sa/using_octoparse_to_continuously_scrape_bitly_data/
11/3/2018 5:41:08 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Using octoparse to continuously scrape bitly data
False
1
9tw1sa
0
1
5
5
9
3.10344827586207
1
0.344827586206897
0
0
119
41.0344827586207
290
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
368
Commented
11/4/2018 8:01:59 PM
In that case, do what the browser does and just make the XHR request yourself, rather than trying to deal with all the HTML event plumbing. It should have the pleasing side effect of giving you JSON, too, rather than HTML.
e91v56x
scrapinghub
mdaniel
t1_e91v56x
https://www.reddit.com/r/scrapinghub/comments/9tw1sa/using_octoparse_to_continuously_scrape_bitly_data/e91v56x/
11/4/2018 8:01:59 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
9tw1sa
t3_9tw1sa
9tw1sa
1
9tw1sa
False
False
False
0
5
5
5
1
2.38095238095238
0
0
0
0
16
38.0952380952381
42
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
Yes
367
RepliedTo
11/4/2018 9:44:56 PM
I've never heard of an XHR request before. How do I do that?
e923amo
scrapinghub
Black_Magic100
t1_e923amo
https://www.reddit.com/r/scrapinghub/comments/9tw1sa/using_octoparse_to_continuously_scrape_bitly_data/e923amo/
11/4/2018 9:44:56 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
e91v56x
t1_e91v56x
e91v56x
1
9tw1sa
True
False
False
1
9
5
5
0
0
0
0
0
0
5
38.4615384615385
13
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
366
RepliedTo
11/5/2018 5:27:54 AM
"XHR" is the abbreviation of [XMLHttpRequest](https://xhr.spec.whatwg.org), which one can see via the [Chrome developer tools, Network Tab, XHR filter](https://duckduckgo.com/?q=chrome+developer+tools+network+xhr&atb=v73-4_q&iax=images&ia=images), and to get the data you'd just replay those requests from your scraping tool. Chrome also has a handy "right click, copy as cURL" option on any one of the request line items, if you want to try it out from the command line.
Although if you've never heard of XHR, this exercise will likely not end well, because scraping is essentially the exercise of pretending to be a web browser.
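To make the replay idea concrete, here is a minimal sketch of the pagination loop, assuming a hypothetical `fetch_page` helper that wraps the real XHR call. The actual bitly endpoint, parameters, and auth cookies are not known here; they would need to be copied from the Network tab (or "copy as cURL") first.

```python
# Sketch of the "replay the XHR yourself" approach. fetch_page(offset)
# is a hypothetical callable that would wrap something like
# session.get(endpoint, params={"offset": offset}).json() against the
# real request captured in the browser's Network tab.
def fetch_all(fetch_page):
    """Collect paginated results by calling fetch_page(offset) until empty."""
    results, offset = [], 0
    while True:
        batch = fetch_page(offset)
        if not batch:          # an empty batch signals the last page
            return results
        results.extend(batch)
        offset += len(batch)   # advance past whatever the server returned
```

The same loop works whether the site pages by offset or by page number; only the parameters passed inside `fetch_page` change.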
e92xj7s
scrapinghub
mdaniel
t1_e92xj7s
https://www.reddit.com/r/scrapinghub/comments/9tw1sa/using_octoparse_to_continuously_scrape_bitly_data/e92xj7s/
11/5/2018 5:27:54 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
e923amo
t1_e923amo
e923amo
1
9tw1sa
False
False
False
2
5
5
5
3
2.7027027027027
0
0
0
0
59
53.1531531531532
111
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
Yes
365
RepliedTo
11/5/2018 11:09:44 AM
Is this something I would have to do in python? Octoparse doesn't really give you a whole lot of options; it's more of a friendly UI tool for quickly parsing HTML.
e937p0a
scrapinghub
Black_Magic100
t1_e937p0a
https://www.reddit.com/r/scrapinghub/comments/9tw1sa/using_octoparse_to_continuously_scrape_bitly_data/e937p0a/
11/5/2018 11:09:44 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
e92xj7s
t1_e92xj7s
e92xj7s
1
9tw1sa
True
False
False
3
9
5
5
1
3.2258064516129
0
0
0
0
14
45.1612903225806
31
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
364
RepliedTo
11/6/2018 4:44:50 AM
If octoparse is just for HTML, then yes: it's not the correct tool for dealing with that data.
Maybe try the mobile version of their site (if such a thing exists), since they tend to be more "plain" html and less wizardry
e9553py
scrapinghub
mdaniel
t1_e9553py
https://www.reddit.com/r/scrapinghub/comments/9tw1sa/using_octoparse_to_continuously_scrape_bitly_data/e9553py/
11/6/2018 4:44:50 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
e937p0a
t1_e937p0a
e937p0a
1
9tw1sa
False
False
False
4
5
5
5
1
2.38095238095238
0
0
0
0
20
47.6190476190476
42
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
Yes
363
RepliedTo
11/6/2018 11:06:41 AM
I don't think that would change the fact that it uses AJAX to load new data. I'm trying to figure out how to do it in python.
e95i305
scrapinghub
Black_Magic100
t1_e95i305
https://www.reddit.com/r/scrapinghub/comments/9tw1sa/using_octoparse_to_continuously_scrape_bitly_data/e95i305/
11/6/2018 11:06:41 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
e9553py
t1_e9553py
e9553py
0
9tw1sa
True
False
False
5
9
5
5
0
0
0
0
0
0
10
35.7142857142857
28
128, 128, 128
3
Solid
50
No
362
Posted
6/19/2020 4:11:20 PM
I wanted to pull the appearance counts for each issue of the Kirara series from a website, but my programming is barely above beginner level, so it was a struggle.
Regular expressions and XPath were so hard... I suppose if I understood them properly, my work efficiency would improve in the future too.
hc3h84
lowlevelaware
mao1756
t3_hc3h84
https://www.reddit.com/r/lowlevelaware/comments/hc3h84/2日間の格闘の末僕はoctoparseというソフトの使い方がなんとなくわかった/
6/19/2020 4:11:20 PM
1/1/0001 12:00:00 AM
False
False
3
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
After two days of struggling, I more or less figured out how to use a piece of software called Octoparse
False
0.81
hc3h84
0
1
60
60
0
0
0
0
0
0
3
100
3
128, 128, 128
3
Solid
50
No
361
Commented
6/20/2020 7:59:01 AM
I'm impressed you even found something like this.
fvf7la3
lowlevelaware
Hib3
t1_fvf7la3
https://www.reddit.com/r/lowlevelaware/comments/hc3h84/2日間の格闘の末僕はoctoparseというソフトの使い方がなんとなくわかった/fvf7la3/
6/20/2020 7:59:01 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hc3h84
t3_hc3h84
hc3h84
0
hc3h84
False
False
False
0
1
60
60
0
0
0
0
0
0
1
100
1
128, 128, 128
3
Solid
50
No
359
Commented
9/8/2020 1:38:26 PM
I think you can try extracting them using a regex. It should be simple (I guess).
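As a hedged sketch of that regex approach: the pattern below covers common address formats only and is an assumption, not a full validator of what counts as a legal email.

```python
import re

# Matches common user@domain.tld shapes; exotic but valid addresses
# (quoted local parts, IP-literal domains) are deliberately ignored.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_emails(text):
    """Return every email-looking substring found in text, in order."""
    return EMAIL_RE.findall(text)
```

Running it over the raw page text (or Octoparse's jumbled output) pulls the addresses back out regardless of the surrounding noise.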
g4fufaz
webscraping
devildaniii
t1_g4fufaz
https://www.reddit.com/r/webscraping/comments/iolqnx/extracting_emails_result_in_jumbled_text/g4fufaz/
9/8/2020 1:38:26 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iolqnx
t3_iolqnx
iolqnx
0
iolqnx
False
False
False
0
1
43
43
0
0
0
0
0
0
7
50
14
128, 128, 128
3
Solid
50
No
360
Posted
9/8/2020 2:50:28 AM
I'm trying to extract the emails from this directory, but they come out jumbled via octoparse:
http://members.calbar.ca.gov/fal/LicenseeSearch/AdvancedSearch?LastNameOption=b&LastName=&FirstNameOption=b&FirstName=&MiddleNameOption=b&MiddleName=&FirmNameOption=b&FirmName=&CityOption=b&City=&State=&Zip=&District=&County=&LegalSpecialty=02&LanguageSpoken=
For instance, wujcigh@itqm.net
uotlsu@wsowkw.org
You get the picture. Does anyone have a solution, or could I pay someone to do it for me? DM pls.
iolqnx
webscraping
tranzadikt
t3_iolqnx
https://www.reddit.com/r/webscraping/comments/iolqnx/extracting_emails_result_in_jumbled_text/
9/8/2020 2:50:28 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Extracting Emails result in Jumbled Text
False
1
iolqnx
0
1
43
43
0
0
0
0
0
0
21
46.6666666666667
45
128, 128, 128
3
Solid
50
No
358
Commented
9/8/2020 3:15:44 AM
Not a tough problem. DM if you need help.
g4eobx7
webscraping
rakesh3368
t1_g4eobx7
https://www.reddit.com/r/webscraping/comments/iolqnx/extracting_emails_result_in_jumbled_text/g4eobx7/
9/8/2020 3:15:44 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iolqnx
t3_iolqnx
iolqnx
0
iolqnx
False
False
False
0
1
43
43
1
11.1111111111111
1
11.1111111111111
0
0
3
33.3333333333333
9
128, 128, 128
3
Solid
50
No
356
Commented
7/3/2022 3:59:37 PM
[Yellowpages.com](https://Yellowpages.com) is actually pretty easy to scrape with a tiny bit of python knowledge.
I wrote a complete tutorial here [How to Scrape YellowPages.com](https://scrapfly.io/blog/how-to-scrape-yellowpages/) but to summarize:
All we need are two Python packages: one for downloading the pages and another for parsing them. I highly recommend the httpx and parsel packages, which can be installed with the `pip install httpx parsel` terminal command. After that, we can import these packages in our python script and scrape ahead!
```python
from parsel import Selector
import httpx
response = httpx.get("https://www.yellowpages.com/san-francisco-ca/mip/ozumo-japanese-restaurant-8083027")
tree = Selector(text=response.text)
print({
"name": tree.css("h1.business-name::text").get(),
"phone": tree.css("#main-aside .phone>strong::text").get(),
"website": tree.css("#main-aside .website-link::attr(href)").get(),
"address": tree.css("#main-aside .address ::text").get(),
})
```
The script above retrieves the name, phone, website, and address of a business.
iepkbsm
webscraping
scrapecrow
t1_iepkbsm
https://www.reddit.com/r/webscraping/comments/suex8l/scrape_data_searching_from_csv_list/iepkbsm/
7/3/2022 3:59:37 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
suex8l
t3_suex8l
suex8l
0
suex8l
False
False
False
0
1
10
10
4
2.38095238095238
0
0
0
0
107
63.6904761904762
168
128, 128, 128
3
Solid
50
No
357
Posted
2/17/2022 3:37:07 AM
I am trying to scrape data from yellowpages. I have a list of businesses with their name and location. To search yellowpages.com you need to input the business name, and also the location (city&state). I have a csv with the names of every business I want to search with their city and state. Do any of the no-code web scraping tools like scrapestorm, octoparse, parsehub, etc., allow you to give a list of terms to enter into a search field like this, one after another, and get the data? The url structure of [yellowpages.com](https://yellowpages.com) does not allow for simple url modifications to move through them.
suex8l
webscraping
JFiney
t3_suex8l
https://www.reddit.com/r/webscraping/comments/suex8l/scrape_data_searching_from_csv_list/
2/17/2022 3:37:07 AM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scrape data searching from csv list?
False
1
suex8l
0
1
10
10
0
0
0
0
0
0
50
44.2477876106195
113
128, 128, 128
3
Solid
50
No
355
Commented
2/18/2022 9:54:14 PM
>The url structure of yellowpages.com does not allow for simple url modifications to move through them.
Well, technically if you have the Business Name, City, State in columns A, B and C, you can write a formula in Excel such as:
="https://www.yellowpages.com/search?search_terms="&A1&"&geo_location_terms="&B1&"%2C+"&C1
and you'll get a link with the search results of each but then you still need to do something with it.
I've tried no-code options but I don't know if I've come across any that allows you to provide a list or file to direct the scraping tool.
A Python script to pull in the first record wouldn't be too difficult to put together but it would be helpful to get an idea of which data you want to return.
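The same URL-building step can be sketched in Python too. This is a minimal sketch, assuming a CSV with name, city, and state columns as described above; the filename and column order are assumptions for illustration.

```python
import urllib.parse

def search_url(name, city, state):
    """Build a yellowpages.com search URL for one business record."""
    query = urllib.parse.urlencode({
        "search_terms": name,
        "geo_location_terms": f"{city}, {state}",
    })
    return f"https://www.yellowpages.com/search?{query}"

# Feeding it rows from a CSV (hypothetical businesses.csv: name,city,state):
#
#     import csv
#     with open("businesses.csv", newline="") as f:
#         for name, city, state in csv.reader(f):
#             print(search_url(name, city, state))
```

`urlencode` handles the `%2C`/`+` escaping that the Excel formula spells out by hand.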
hxi2fsx
webscraping
sudodoyou
t1_hxi2fsx
https://www.reddit.com/r/webscraping/comments/suex8l/scrape_data_searching_from_csv_list/hxi2fsx/
2/18/2022 9:54:14 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
suex8l
t3_suex8l
suex8l
0
suex8l
False
False
False
0
1
10
10
3
2.18978102189781
1
0.72992700729927
0
0
57
41.6058394160584
137
128, 128, 128
3
Solid
50
No
354
Commented
2/17/2022 6:27:57 AM
I don't know about the no-code services... but depending on what you need I could just write and/or pull it for you. How many pages of results are you trying to get per business?
hxa0eae
webscraping
i_am_extra_syrup
t1_hxa0eae
https://www.reddit.com/r/webscraping/comments/suex8l/scrape_data_searching_from_csv_list/hxa0eae/
2/17/2022 6:27:57 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
suex8l
t3_suex8l
suex8l
0
suex8l
False
False
False
0
1
10
10
0
0
0
0
0
0
11
30.5555555555556
36
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
697
Posted
12/27/2016 8:49:26 AM
http://www.octoparse.com/tutorial/guidelines-for-the-use-of-cloud-service-in-octoparse-1/?category=CLOUDEXTRACTION
5kisw8
datamining
paulblack2025
t3_5kisw8
https://www.reddit.com/r/datamining/comments/5kisw8/the_best_guidelines_to_use_cloud_service_on/
12/27/2016 8:49:26 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
The best guidelines to use cloud service on Octoparse
False
1
5kisw8
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
696
Posted
10/19/2016 3:19:17 AM
[removed]
5887fv
startups
paulblack2025
t3_5887fv
https://www.reddit.com/r/startups/comments/5887fv/1_automated_web_scraping_software_no_programming/
10/19/2016 3:19:17 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
#1 Automated Web Scraping Software · No Programming Needed · Free · Octoparse
False
1
5887fv
0
400
9
9
0
0
0
0
0
0
1
100
1
128, 128, 128
3
Solid
50
No
695
Commented
10/19/2016 3:19:17 AM
Your post has been removed because /r/startups requires at least 200 characters of discussion and commentary with any self post.
*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/startups) if you have any questions or concerns.*
d8y9vzq
startups
AutoModerator
t1_d8y9vzq
https://www.reddit.com/r/startups/comments/5887fv/1_automated_web_scraping_software_no_programming/d8y9vzq/
10/19/2016 3:19:17 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
5887fv
t3_5887fv
5887fv
0
5887fv
False
False
False
0
1
9
9
0
0
1
2
0
0
20
40
50
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
694
Posted
6/13/2016 9:01:32 AM
http://www.octoparse.com/
4nusw1
bigdata_analytics
paulblack2025
t3_4nusw1
https://www.reddit.com/r/bigdata_analytics/comments/4nusw1/what_is_octoparse/
6/13/2016 9:01:32 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What is Octoparse?
False
1
4nusw1
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
693
Posted
1/22/2017 9:12:16 AM
https://www.reddit.com/user/paulblack2025/m/octoparse/
5pg4ig
multihub
paulblack2025
t3_5pg4ig
https://www.reddit.com/r/multihub/comments/5pg4ig/1_web_scraping_service_free_data_extraction/
1/22/2017 9:12:16 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
#1 Web Scraping Service & Free Data Extraction Tool|Octoparse, Free Web Scraping
False
1
5pg4ig
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
692
Posted
11/2/2016 7:56:31 AM
http://www.octoparse.com/blog/10-best-free-tools-for-startups-octoparse/
5aop5o
BusinessIntelligence
paulblack2025
t3_5aop5o
https://www.reddit.com/r/BusinessIntelligence/comments/5aop5o/10_best_free_tools_for_startups_octoparse/
11/2/2016 7:56:31 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
10 Best Free Tools for Startups - Octoparse
False
1
5aop5o
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
691
Posted
12/21/2016 3:19:37 AM
http://www.octoparse.com/faq/unable-to-connect-to-octoparse/
5jhp8x
datamining
paulblack2025
t3_5jhp8x
https://www.reddit.com/r/datamining/comments/5jhp8x/how_to_fix_unable_to_connect_to_octoparse_error/
12/21/2016 3:19:37 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to fix "Unable to connect to Octoparse" error?
False
1
5jhp8x
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
690
Posted
3/15/2017 7:47:42 AM
http://www.octoparse.com/octoparse-anniversary-sale/
5zi2ay
datamining
paulblack2025
t3_5zi2ay
https://www.reddit.com/r/datamining/comments/5zi2ay/up_to_35_off_octoparse_1st_anniversary_sale_give/
3/15/2017 7:47:42 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
UP TO 35% OFF! - Octoparse 1st Anniversary Sale GIVE BACK
False
1
5zi2ay
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
689
Posted
6/13/2016 9:12:58 AM
http://www.octoparse.com/product/
4nutwj
datascience
paulblack2025
t3_4nutwj
https://www.reddit.com/r/datascience/comments/4nutwj/octoparse_a_free_web_scraping_tool/
6/13/2016 9:12:58 AM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse, a free web scraping tool.
False
0.67
4nutwj
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
688
Posted
12/22/2016 8:56:28 AM
https://www.youtube.com/watch?v=Cw32LNpzxFk
5jpmf9
eFreebies
paulblack2025
t3_5jpmf9
https://www.reddit.com/r/eFreebies/comments/5jpmf9/how_to_use_octoparse_to_make_a_crawler/
12/22/2016 8:56:28 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Use Octoparse to Make a Crawler?
False
1
5jpmf9
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
687
Posted
12/22/2016 9:11:19 AM
https://www.youtube.com/attribution_link?a=tUkTWZtePJI&u=%2Fwatch%3Fv%3DCw32LNpzxFk%26feature%3Dshare
5jpo40
dataisbeautiful
paulblack2025
t3_5jpo40
https://www.reddit.com/r/dataisbeautiful/comments/5jpo40/octoparse_tutorial_how_to_use_octoparse_to_make_a/
12/22/2016 9:11:19 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse Tutorial: How to Use Octoparse to Make a Crawler?
False
1
5jpo40
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
686
Posted
12/30/2016 2:41:13 AM
http://www.octoparse.com/tutorial/10-essential-tutorials-that-every-octoparse-newbie-should-know/?category=OTHERS
5l0zz6
datamining
paulblack2025
t3_5l0zz6
https://www.reddit.com/r/datamining/comments/5l0zz6/10_essential_tutorials_that_every_octoparse/
12/30/2016 2:41:13 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
10 Essential Tutorials That Every Octoparse Newbie Should Know
False
1
5l0zz6
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
685
Posted
12/19/2016 7:24:25 AM
[removed]
5j558l
learnprogramming
paulblack2025
t3_5j558l
https://www.reddit.com/r/learnprogramming/comments/5j558l/6_tips_to_use_the_web_scraping_tool_octoparse/
12/19/2016 7:24:25 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
6 Tips to Use the Web Scraping Tool Octoparse
False
1
5j558l
0
400
9
9
0
0
0
0
0
0
1
100
1
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
684
Posted
12/20/2016 8:22:08 AM
[removed]
5jc6nj
techsupport
paulblack2025
t3_5jc6nj
https://www.reddit.com/r/techsupport/comments/5jc6nj/scrape_data_from_online_accommodation_booking/
12/20/2016 8:22:08 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scrape Data from Online Accommodation Booking Sites|Octoparse
False
1
5jc6nj
0
400
9
9
0
0
0
0
0
0
1
100
1
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
683
Posted
10/19/2016 3:34:42 AM
[removed]
5889w6
startups
paulblack2025
t3_5889w6
https://www.reddit.com/r/startups/comments/5889w6/octoparse_free_visual_web_scraping_software_turn/
10/19/2016 3:34:42 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse- Free Visual Web Scraping Software · Turn Unstructured Data into Structured Data Sets
False
1
5889w6
0
400
9
9
0
0
0
0
0
0
1
100
1
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
682
Posted
10/19/2016 11:18:23 AM
http://www.octoparse.com/
589qdn
IMadeThis
paulblack2025
t3_589qdn
https://www.reddit.com/r/IMadeThis/comments/589qdn/octoparse_a_visual_web_scraping_software/
10/19/2016 11:18:23 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse - A Visual Web Scraping Software
False
1
589qdn
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
681
Posted
3/15/2017 7:28:50 AM
https://i.redd.it/lj99cnjnxily.jpg
5zi079
pics
paulblack2025
t3_5zi079
https://www.reddit.com/r/pics/comments/5zi079/octoparse_anniversary_give_back_is_live_now/
3/15/2017 7:28:50 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse Anniversary Give Back is Live now!!
False
1
5zi079
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
680
Posted
3/20/2017 10:38:46 AM
http://www.octoparse.com/
60fr93
Underdog_Promotions
paulblack2025
t3_60fr93
https://www.reddit.com/r/Underdog_Promotions/comments/60fr93/octoparse_automated_web_scraping_tool_greatly/
3/20/2017 10:38:46 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse: Automated web scraping tool. Greatly save your time!!
False
1
60fr93
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
679
Posted
8/4/2017 3:41:42 AM
http://www.octoparse.com/blog/octoparse-vs-importio-comparison-which-is-best-for-web-scraping/
6rhxwy
datamining
paulblack2025
t3_6rhxwy
https://www.reddit.com/r/datamining/comments/6rhxwy/octoparse_vs_importio_comparison_which_is_better/
8/4/2017 3:41:42 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse vs. Import.io comparison: which is better for web scraping?
False
1
6rhxwy
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
678
Posted
6/13/2016 9:01:32 AM
http://www.octoparse.com/
4nusw1
bigdata_analytics
paulblack2025
t3_4nusw1
https://www.reddit.com/r/bigdata_analytics/comments/4nusw1/what_is_octoparse/
6/13/2016 9:01:32 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What is Octoparse?
False
1
4nusw1
0
400
9
9
135, 121, 121
3.37748344370861
Dash Dot Dot
48.3822138126774
No
677
Posted
12/21/2016 3:00:51 AM
http://www.octoparse.com/faq/octoparse-free-trial-issues/
5jhm0l
datamining
paulblack2025
t3_5jhm0l
https://www.reddit.com/r/datamining/comments/5jhm0l/how_to_apply_for_an_octoparse_free_trial/
12/21/2016 3:00:51 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Apply for an Octoparse Free Trial?
False
1
5jhm0l
0
400
9
9
128, 128, 128
3
Solid
50
No
353
Commented
6/14/2016 10:25:32 PM
Don't forget what happened to kimono
d49evok
datascience
lieutenant_lowercase
t1_d49evok
https://www.reddit.com/r/datascience/comments/4nutwj/octoparse_a_free_web_scraping_tool/d49evok/
6/14/2016 10:25:32 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
4nutwj
t3_4nutwj
4nutwj
1
4nutwj
False
False
False
0
1
9
9
0
0
0
0
0
0
3
50
6
128, 128, 128
3
Solid
50
Yes
352
RepliedTo
6/15/2016 5:02:08 PM
what happened?
d4ae56x
datascience
sasjkh3333
t1_d4ae56x
https://www.reddit.com/r/datascience/comments/4nutwj/octoparse_a_free_web_scraping_tool/d4ae56x/
6/15/2016 5:02:08 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
d49evok
t1_d49evok
d49evok
1
4nutwj
False
False
False
1
1
9
9
0
0
0
0
0
0
1
50
2
128, 128, 128
3
Solid
50
Yes
351
RepliedTo
6/15/2016 6:51:43 PM
Got bought out and closed down leaving everyone who relied on it screwed. It's relatively trivial to code a scraper so I would never use an automated scraping tool
d4ajium
datascience
lieutenant_lowercase
t1_d4ajium
https://www.reddit.com/r/datascience/comments/4nutwj/octoparse_a_free_web_scraping_tool/d4ajium/
6/15/2016 6:51:43 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
d4ae56x
t1_d4ae56x
d4ae56x
0
4nutwj
False
False
False
2
1
9
9
0
0
2
6.89655172413793
0
0
14
48.2758620689655
29
128, 128, 128
3
Solid
50
No
350
Commented
3/8/2022 6:10:09 PM
I think salesblink is also best for email marketing. I used it.
hzv7dx4
Octoparse_ideas
Inevitable-Dish-732
t1_hzv7dx4
https://www.reddit.com/r/Octoparse_ideas/comments/t8ljkf/how_to_conduct_b2b_lead_generation_10_tips_and/hzv7dx4/
3/8/2022 6:10:09 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
t8ljkf
t3_t8ljkf
t8ljkf
0
t8ljkf
False
False
False
0
1
3
3
1
8.33333333333333
0
0
0
0
5
41.6666666666667
12
128, 128, 128
3
Solid
50
No
348
Commented
5/2/2022 4:46:58 PM
[https://www.bing.com/packagetrackingv2?packNum=](https://www.bing.com/packagetrackingv2?packNum=)<tracking number>&carrier=<FedEx or UPS>
replace the <tracking number> with the tracking number and <FedEx or UPS> with either FedEx or UPS
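Filling in that template can be sketched as a tiny Python helper; the tracking number used in any example call is of course hypothetical, and the carrier string must be exactly "FedEx" or "UPS" as the template above states.

```python
import urllib.parse

def bing_tracking_url(tracking_number, carrier):
    """Build the Bing package-tracking URL for a FedEx or UPS number."""
    # quote() guards against stray characters in pasted tracking numbers
    return (
        "https://www.bing.com/packagetrackingv2"
        f"?packNum={urllib.parse.quote(tracking_number)}"
        f"&carrier={urllib.parse.quote(carrier)}"
    )
```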
i71p9r6
webscraping
Mehpew
t1_i71p9r6
https://www.reddit.com/r/webscraping/comments/b07zt9/trying_to_use_webscraping_to_track_fedex_and_ups/i71p9r6/
5/2/2022 4:46:58 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
b07zt9
t3_b07zt9
b07zt9
1
b07zt9
False
False
False
0
1
30
30
0
0
0
0
0
0
20
57.1428571428571
35
128, 128, 128
3
Solid
50
No
347
RepliedTo
10/6/2022 2:49:53 AM
Yooooo thanks for this. This shit saved my ass at work.
ir8hb47
webscraping
its_kiddos
t1_ir8hb47
https://www.reddit.com/r/webscraping/comments/b07zt9/trying_to_use_webscraping_to_track_fedex_and_ups/ir8hb47/
10/6/2022 2:49:53 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
i71p9r6
t1_i71p9r6
i71p9r6
0
b07zt9
False
False
False
1
1
30
30
1
9.09090909090909
1
9.09090909090909
0
0
4
36.3636363636364
11
128, 128, 128
3
Solid
50
No
349
Posted
3/12/2019 1:49:21 PM
Right now I am trying to use octoparse to input FedEx and UPS tracking codes, grab the time of arrival, and just give me that data. Does anyone know if it is possible to make a script that inputs the tracking numbers into UPS/FedEx automatically to scrape the data in octoparse? Or if there is any program that can do this? We ship hundreds of things a day. Any help would be appreciated.
b07zt9
webscraping
SoatzoTakanoshi
t3_b07zt9
https://www.reddit.com/r/webscraping/comments/b07zt9/trying_to_use_webscraping_to_track_fedex_and_ups/
3/12/2019 1:49:21 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Trying to use webscraping to track FedEx and UPS packages in Octoparse!
False
1
b07zt9
0
1
30
30
2
2.66666666666667
0
0
0
0
33
44
75
128, 128, 128
3
Solid
50
No
346
Commented
3/19/2019 10:55:22 PM
Use the FedEx and UPS APIs.
eix0cao
webscraping
bobbysteel
t1_eix0cao
https://www.reddit.com/r/webscraping/comments/b07zt9/trying_to_use_webscraping_to_track_fedex_and_ups/eix0cao/
3/19/2019 10:55:22 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
b07zt9
t3_b07zt9
b07zt9
0
b07zt9
False
False
False
0
1
30
30
0
0
0
0
0
0
4
80
5
128, 128, 128
3
Solid
50
Yes
343
Commented
11/22/2021 1:22:03 PM
These articles are mostly bullshit.
The programming language has little influence on your salary compared to your company and your seniority.
However, it is easier to get into those generous companies, which are often unicorns/GAFA, with the languages they use internally, namely Python, Java, or JS/TS.
Nothing stops you from specializing in other languages, but there will often be fewer job offers.
As for your second question: in general, you will earn about 20% less outside Paris, but again, it varies a lot by company. Those used to remote work generally offer the same salary everywhere.
Otherwise, as for starting salaries, developers range between 40k and 50k.
hlmvvbe
programmation
somecroissantswe
t1_hlmvvbe
https://www.reddit.com/r/programmation/comments/qzksyo/les_langages_de_programmation_les_plus_demandés/hlmvvbe/
11/22/2021 1:22:03 PM
11/22/2021 1:29:40 PM
False
False
8
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
qzksyo
t3_qzksyo
qzksyo
1
qzksyo
False
False
False
0
1
13
13
2
1.61290322580645
2
1.61290322580645
0
0
79
63.7096774193548
124
128, 128, 128
3
Solid
50
Yes
342
RepliedTo
12/17/2021 9:22:10 AM
Oops, I forgot!! A thousand thanks for your answer!
howbx8g
programmation
nanami2977
t1_howbx8g
https://www.reddit.com/r/programmation/comments/qzksyo/les_langages_de_programmation_les_plus_demandés/howbx8g/
12/17/2021 9:22:10 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hlmvvbe
t1_hlmvvbe
hlmvvbe
0
qzksyo
True
False
False
1
1
13
13
0
0
0
0
0
0
7
87.5
8
128, 128, 128
3
Solid
50
No
341
Posted
5/2/2022 5:19:12 PM
https://presoftsol.com/octoparse-crack-free-download/#.YnASZdWLzFY.reddit
uguoba
videos
Educational_Big_3934
t3_uguoba
https://www.reddit.com/r/videos/comments/uguoba/octoparse_850_crack_free_download_2022_pre_soft/
5/2/2022 5:19:12 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse 8.5.0 Crack Free Download 2022 - Pre Soft Sol
False
1
uguoba
0
1
1
1
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
336
Commented
6/25/2021 2:34:36 PM
Twitter has an API that requires registration but allows you to download 500,000 tweets per month for free.
h2zzikm
Entrepreneur
geezeer84
t1_h2zzikm
https://www.reddit.com/r/Entrepreneur/comments/o7lzen/twitter_bio_scraper_to_identify_microinfluencers/h2zzikm/
6/25/2021 2:34:36 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o7lzen
t3_o7lzen
o7lzen
1
o7lzen
False
False
False
0
4
61
61
1
5.26315789473684
0
0
0
0
10
52.6315789473684
19
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
335
RepliedTo
6/25/2021 4:09:04 PM
The official Twitter API is definitely a good idea for many use cases. I don't like its results compared to the results from the web UI when performing a user search.
h30c1hb
Entrepreneur
jaypat87
t1_h30c1hb
https://www.reddit.com/r/Entrepreneur/comments/o7lzen/twitter_bio_scraper_to_identify_microinfluencers/h30c1hb/
6/25/2021 4:09:04 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h2zzikm
t1_h2zzikm
h2zzikm
0
o7lzen
True
False
False
1
4
61
61
1
3.33333333333333
0
0
0
0
17
56.6666666666667
30
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
334
Commented
6/25/2021 2:34:36 PM
Twitter has an API that requires registration but allows you to download 500,000 tweets per month for free.
h2zzikm
Entrepreneur
geezeer84
t1_h2zzikm
https://www.reddit.com/r/Entrepreneur/comments/o7lzen/twitter_bio_scraper_to_identify_microinfluencers/h2zzikm/
6/25/2021 2:34:36 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o7lzen
t3_o7lzen
o7lzen
1
o7lzen
False
False
False
0
4
61
61
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
333
RepliedTo
6/25/2021 4:09:04 PM
The official Twitter API is definitely a good idea for many use cases. I don't like its results compared to the results from the web UI when performing a user search.
h30c1hb
Entrepreneur
jaypat87
t1_h30c1hb
https://www.reddit.com/r/Entrepreneur/comments/o7lzen/twitter_bio_scraper_to_identify_microinfluencers/h30c1hb/
6/25/2021 4:09:04 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h2zzikm
t1_h2zzikm
h2zzikm
0
o7lzen
True
False
False
1
4
61
61
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
No
340
Posted
6/25/2021 11:40:12 AM
Micro-influencer-based marketing strategies are quickly becoming established methods in the post-COVID era. Micro-influencers are people on Twitter (and Instagram) who have 1,000-100,000 followers, are very active in a niche, and typically engage with their audience more than huge celebrity influencers.
The Twitter search bar is a great way to hunt for these micro-influencers. However, it is a pain to export the information as a CSV file and sort it on the basis of follower count, number of tweets, etc.
If you are a coder, you might be able to put together something that works. However, scraping Twitter is tricky due to its strong anti-bot protections, so it will require you to implement something like IP proxy rotation, CAPTCHA solving, etc.
The ways I identified to get this data are:
1) Pay a freelancer from Fiverr/Upwork. The quality and cost vary according to the freelancer selected. It should cost at least $200-$400 in freelancer fees, plus additional fees for a server, IP rotation, etc. Also remember that Twitter changes its webpage layout pretty often, so your scraper code may break and you will have to pay for additional changes. This option is good in case you wish to resell your Twitter data to third parties or if your usage will exceed a hundred thousand requests a month.
2) Write code from scratch or hit an API from [rapidAPI](https://rapidapi.com/marketplace) or [Apify](https://apify.com/). Based on my research, these cost $50/month for the cheapest plan.
3) Subscribe to a low-code or no-code platform like Octoparse, Scrapestorm, etc. These platforms cost a bunch, and you are still required to spend time configuring them correctly, which can take hours. Phantombuster is another tool that can do the job. One advantage with these is that even though they cost over $100/month, you can scrape other sites besides Twitter bios and profile info.
4) Lastly, there are dedicated tools that get Twitter bio information and nothing else. The main ones are [followerwonk.com](https://followerwonk.com), [Specrom's Twitter bio scraper](https://www.specrom.com/twitter-bio-scraper/), and [https://www.exportdata.io/](https://www.exportdata.io/) and [https://www.vicinitas.io/free-tools/download-twitter-followers](https://www.vicinitas.io/free-tools/download-twitter-followers)
One of the things I have realized is that some tools ask for your Twitter login details. This is a bad idea since Twitter can track how many pages/search queries you have requested and they can block your account for suspicious activity.
Another thing to check is whether it supports advanced Twitter search queries, so that you can simply search for beauty influencers near Atlanta with "beauty influencer" near:Atlanta within:15mi.
From a cost perspective, I have seen these tools go for anywhere between $20-30 a month. However, there is an [Appsumo deal going on for $39/year](https://appsumo.com/products/marketplace-twitter-bio-scraper/).
I would love to hear about how everyone else here is finding relevant users and micro-influencers.
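The advanced search operators mentioned above compose well programmatically. A minimal sketch, assuming Python and the operator syntax shown in the post (the helper function name is mine, not part of any tool listed here):

```python
from urllib.parse import urlencode

def build_twitter_search_url(phrase, city=None, radius_mi=None):
    """Compose a Twitter advanced-search URL using the near:/within: operators."""
    query = f'"{phrase}"'           # exact-phrase match
    if city:
        query += f" near:{city}"     # location operator
    if radius_mi:
        query += f" within:{radius_mi}mi"  # radius operator
    return "https://twitter.com/search?" + urlencode({"q": query})

# Same example query as in the post: beauty influencers within 15 miles of Atlanta.
url = build_twitter_search_url("beauty influencer", city="Atlanta", radius_mi=15)
```

Any tool that accepts a raw query string should also take the `q` value on its own.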
o7lzen
Entrepreneur
jaypat87
t3_o7lzen
https://www.reddit.com/r/Entrepreneur/comments/o7lzen/twitter_bio_scraper_to_identify_microinfluencers/
6/25/2021 11:40:12 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Twitter bio Scraper to identify micro-influencers, search and export users and profile info as CSV file
False
0.33
o7lzen
0
8
61
61
17
3.30739299610895
6
1.16731517509728
0
0
261
50.7782101167315
514
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
No
339
Commented
6/25/2021 11:48:52 AM
I just want to put in a disclaimer saying that I am the creator of one of the Twitter bio scraper tools mentioned above. I have tried to keep self-promotion to a minimum and gave shoutouts to all competing products for fairness and to reduce any bias. If the mods feel it was excessive, then I will be happy to pull down all the links.
Having said that, I genuinely want to know about any tools/ways people here are identifying and searching for micro-influencers and exporting that info into their prospecting workflows.
h2zhih0
Entrepreneur
jaypat87
t1_h2zhih0
https://www.reddit.com/r/Entrepreneur/comments/o7lzen/twitter_bio_scraper_to_identify_microinfluencers/h2zhih0/
6/25/2021 11:48:52 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o7lzen
t3_o7lzen
o7lzen
0
o7lzen
True
False
False
0
8
61
61
2
2.10526315789474
3
3.15789473684211
0
0
39
41.0526315789474
95
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
No
338
Posted
6/25/2021 11:40:12 AM
Micro-influencer-based marketing strategies are quickly becoming established methods in the post-COVID era. Micro-influencers are people on Twitter (and Instagram) who have 1,000-100,000 followers, are very active in a niche, and typically engage with their audience more than huge celebrity influencers.
The Twitter search bar is a great way to hunt for these micro-influencers. However, it is a pain to export the information as a CSV file and sort it on the basis of follower count, number of tweets, etc.
If you are a coder, you might be able to put together something that works. However, scraping Twitter is tricky due to its strong anti-bot protections, so it will require you to implement something like IP proxy rotation, CAPTCHA solving, etc.
The ways I identified to get this data are:
1) Pay a freelancer from Fiverr/Upwork. The quality and cost vary according to the freelancer selected. It should cost at least $200-$400 in freelancer fees, plus additional fees for a server, IP rotation, etc. Also remember that Twitter changes its webpage layout pretty often, so your scraper code may break and you will have to pay for additional changes. This option is good in case you wish to resell your Twitter data to third parties or if your usage will exceed a hundred thousand requests a month.
2) Write code from scratch or hit an API from [rapidAPI](https://rapidapi.com/marketplace) or [Apify](https://apify.com/). Based on my research, these cost $50/month for the cheapest plan.
3) Subscribe to a low-code or no-code platform like Octoparse, Scrapestorm, etc. These platforms cost a bunch, and you are still required to spend time configuring them correctly, which can take hours. Phantombuster is another tool that can do the job. One advantage with these is that even though they cost over $100/month, you can scrape other sites besides Twitter bios and profile info.
4) Lastly, there are dedicated tools that get Twitter bio information and nothing else. The main ones are [followerwonk.com](https://followerwonk.com), [Specrom's Twitter bio scraper](https://www.specrom.com/twitter-bio-scraper/), and [https://www.exportdata.io/](https://www.exportdata.io/) and [https://www.vicinitas.io/free-tools/download-twitter-followers](https://www.vicinitas.io/free-tools/download-twitter-followers)
One of the things I have realized is that some tools ask for your Twitter login details. This is a bad idea since Twitter can track how many pages/search queries you have requested and they can block your account for suspicious activity.
Another thing to check is whether it supports advanced Twitter search queries, so that you can simply search for beauty influencers near Atlanta with "beauty influencer" near:Atlanta within:15mi.
From a cost perspective, I have seen these tools go for anywhere between $20-30 a month. However, there is an [Appsumo deal going on for $39/year](https://appsumo.com/products/marketplace-twitter-bio-scraper/).
I would love to hear about how everyone else here is finding relevant users and micro-influencers.
o7lzen
Entrepreneur
jaypat87
t3_o7lzen
https://www.reddit.com/r/Entrepreneur/comments/o7lzen/twitter_bio_scraper_to_identify_microinfluencers/
6/25/2021 11:40:12 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Twitter bio Scraper to identify micro-influencers, search and export users and profile info as CSV file
False
0.33
o7lzen
0
8
61
61
128, 128, 128
3.00662251655629
Dash Dot Dot
49.9716177861873
No
337
Commented
6/25/2021 11:48:52 AM
I just want to put in a disclaimer saying that I am the creator of one of the Twitter bio scraper tools mentioned above. I have tried to keep self-promotion to a minimum and gave shoutouts to all competing products for fairness and to reduce any bias. If the mods feel it was excessive, then I will be happy to pull down all the links.
Having said that, I genuinely want to know about any tools/ways people here are identifying and searching for micro-influencers and exporting that info into their prospecting workflows.
h2zhih0
Entrepreneur
jaypat87
t1_h2zhih0
https://www.reddit.com/r/Entrepreneur/comments/o7lzen/twitter_bio_scraper_to_identify_microinfluencers/h2zhih0/
6/25/2021 11:48:52 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o7lzen
t3_o7lzen
o7lzen
0
o7lzen
True
False
False
0
8
61
61
128, 128, 128
3
Solid
50
No
332
Posted
4/18/2022 4:13:23 PM
I have been doing web scraping for over 5 years, working with companies both large and small and had the chance to use many tools. In fact, I personally like exploring new tools all the time.
With the overwhelming amount of data available on the internet, web scraping has become an essential approach to generating a dataset as input to your decision engine.
For example, while working with an eCommerce client who asked us to extract data from a list of over 1,000 websites, I used the Python Scrapy framework. I chose Scrapy because it is fast compared to other libraries like Beautiful Soup, and it allowed us to configure the expected output, proxy settings, middleware settings, etc., which let us complete the project on time.
What I am trying to point out is that different tools are required for different needs, and each could be best in its own way.
Based on my experience working with various clients, I have put together a list of the tools that I've used in the past and that have proven successful in helping achieve the desired outcomes.
Here is the list of the best web scraping tools of 2022.
&#x200B;
* **Scraper API**
Scraper API is a proxy API for web scraping handling proxies, browsers and CAPTCHAs that allows developers to get the HTML from any page with a simple API call. It has a great pool of proxies that supports eCommerce price scraping, search engine scraping, social media scraping and more!
It comes with a 7-day free trial with 5,000 free API credits, starts at $29 per month, and can be customized according to your requirements.
&#x200B;
* **ScrapeSimple**
ScrapeSimple is a service that allows users to turn data from any website into a CSV file without the hassle of coding. Users provide the information they need from the website, and ScrapeSimple builds a custom web scraper based on those requirements that delivers the information periodically, directly to their mailbox.
They have a quick response time and a user-friendly service. They accept jobs with a minimum monthly budget of $250.
&#x200B;
* **Octoparse**
Octoparse is a tool that allows users to scrape data from a website with an easy-to-use interface and does not require coding. It deals with all kinds of websites, offering features like scheduled scraping, cloud service, IP rotation, etc. It also features a point-and-click screen scraper that helps scrape data from forms, input search terms, handle infinite scroll, and more.
It has a free plan that allows users to make up to 10 crawlers, and also offers paid and custom plans.
&#x200B;
* **ParseHub**
ParseHub is an advanced power tool that allows users to build web scrapers and, just like the above tools, does not require any coding. It also offers features similar to Octoparse, like cloud service, scheduled scraping, IP rotation, and a fast turnaround time. It is the choice of many data analysts, data scientists, journalists, and more.
It offers multi-OS support and has a free subscription that allows users to scrape data from 200 pages. Paid plans start at $189, and users can also customize their plans based on their requirements.
&#x200B;
* **BrightData (Luminati)**
BrightData is a data collection platform that allows you to retrieve public web data. It is one of the most reliable, flexible, and fully compliant scraping tools available. Its pricing is based on proxy infrastructure.
&#x200B;
* **AvesAPI**
AvesAPI is one of the best SERP API tools for scraping structured data from Google. The tool is focused on structured data scraping and is best suited for SEOs, agencies, and marketing professionals. It also offers geo-specific searches that can help agencies and professionals find the right data.
It has a free plan which offers data for up to 1,000 searches, and the paid plan starts at $50 per month and goes up to $500 per month for their professional pack.
&#x200B;
* **Scrapy**
Scrapy is an open-source web scraping framework built for Python developers. It is fast and powerful and offers extensive data scraping capabilities. It is reliable and easy to use. The main attraction of Scrapy is that it is completely free and offers a host of features. It also offers multi-OS support for Linux, Windows, Mac, and BSD.
&#x200B;
* **Diffbot**
Diffbot is another popular scraping tool focused on providing enterprise-level solutions to companies looking for answers to their specific scraping needs. It has a two-week free trial and its paid plan starts at $299 per month, but its premium services and support offer great value for money for enterprises.
&#x200B;
* **ScrapingBee**
ScrapingBee is another popular tool whose USP is handling headless browsers and rotating proxies for its users. Like other scraping tools, it offers features such as JavaScript rendering, rotating proxies, SERPs, etc.
They offer a free trial and their paid plans start at $49/month. They also offer an enterprise plan with custom pricing based on your requirements.
&#x200B;
* **Grepsr**
Grepsr is a web scraping solution that provides its customers with end-to-end data collection services. It helps with various types of data collection that can assist its clients with lead generation, competitive analysis, news aggregation, and financial data collection. It has a host of features for various industries and has a free plan. Its paid plan starts at $199/source and offers features specific to your needs.
&#x200B;
I hope this answer helps you to find the right data scraping tool for your organization or for your personal use.
Happy Reading!
u6gve0
u_SandeepNatoo
SandeepNatoo
t3_u6gve0
https://www.reddit.com/r/u_SandeepNatoo/comments/u6gve0/what_are_some_of_the_best_web_scraping_tools/
4/18/2022 4:13:23 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What are some of the best web scraping tools?
False
1
u6gve0
0
1
1
1
47
4.88058151609553
4
0.415368639667705
0
0
504
52.3364485981308
963
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
331
Posted
6/19/2017 9:09:27 PM
I'd like to scrape info for employees (Title, Location & other info that you can get from actually clicking on their profile page) at a certain company (example: Microsoft), after I provide log-in info and link for that company's "People." I have tried parsehub and Octoparse with no success. $25 for successful python code w comments, $5 for referral to another app / widget that can successfully do it.
6i9epa
slavelabour
Dancingrobot123
t3_6i9epa
https://www.reddit.com/r/slavelabour/comments/6i9epa/task_25_python_data_extractor_for_linkedin/
6/19/2017 9:09:27 PM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
[TASK] $25 Python data extractor for Linkedin profiles post log-in
False
0.67
6i9epa
0
4
62
62
3
4.34782608695652
0
0
0
0
30
43.4782608695652
69
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
330
Posted
6/19/2017 9:09:27 PM
I'd like to scrape info for employees (Title, Location & other info that you can get from actually clicking on their profile page) at a certain company (example: Microsoft), after I provide log-in info and link for that company's "People." I have tried parsehub and Octoparse with no success. $25 for successful python code w comments, $5 for referral to another app / widget that can successfully do it.
6i9epa
slavelabour
Dancingrobot123
t3_6i9epa
https://www.reddit.com/r/slavelabour/comments/6i9epa/task_25_python_data_extractor_for_linkedin/
6/19/2017 9:09:27 PM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
[TASK] $25 Python data extractor for Linkedin profiles post log-in
False
0.67
6i9epa
0
4
62
62
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
329
Commented
6/19/2017 9:09:34 PM
Here is my info on /u/Dancingrobot123:
| Redditor Since | Credo | Verifications | Feedback | SLRep |
|---|---|---|---|---|
| November 2016 | **no account found** ([create](https://www.credo360.com)) | N/A | N/A | **no profile found** ([create](https://www.reddit.com/r/slrep/submit?selftext=true&title=/u/Dancingrobot123 SL Network Rep Profile&text=* Redditor since [insert cake day here] %0D* Known Impersonators: [list here]%0D* Examples of my work: [add links here, preferrably Imgur links] %0D* Skills/Services: [insert text here] %0D* Number of Transactions Completed: 0)) |
*****
Got questions or comments about CredoBot? Post them [here](https://www.reddit.com/r/Credo360/comments/6gr36c)
dj4gjm1
slavelabour
CredoBot
t1_dj4gjm1
https://www.reddit.com/r/slavelabour/comments/6i9epa/task_25_python_data_extractor_for_linkedin/dj4gjm1/
6/19/2017 9:09:34 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
6i9epa
t3_6i9epa
6i9epa
0
6i9epa
False
False
False
0
4
62
62
1
1.01010101010101
0
0
0
0
59
59.5959595959596
99
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
328
Commented
6/19/2017 9:09:34 PM
Here is my info on /u/Dancingrobot123:
| Redditor Since | Credo | Verifications | Feedback | SLRep |
|---|---|---|---|---|
| November 2016 | **no account found** ([create](https://www.credo360.com)) | N/A | N/A | **no profile found** ([create](https://www.reddit.com/r/slrep/submit?selftext=true&title=/u/Dancingrobot123 SL Network Rep Profile&text=* Redditor since [insert cake day here] %0D* Known Impersonators: [list here]%0D* Examples of my work: [add links here, preferrably Imgur links] %0D* Skills/Services: [insert text here] %0D* Number of Transactions Completed: 0)) |
*****
Got questions or comments about CredoBot? Post them [here](https://www.reddit.com/r/Credo360/comments/6gr36c)
dj4gjm1
slavelabour
CredoBot
t1_dj4gjm1
https://www.reddit.com/r/slavelabour/comments/6i9epa/task_25_python_data_extractor_for_linkedin/dj4gjm1/
6/19/2017 9:09:34 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
6i9epa
t3_6i9epa
6i9epa
0
6i9epa
False
False
False
0
4
62
62
128, 128, 128
3
Solid
50
No
327
Posted
6/13/2019 6:30:11 PM
I want to collect public price info from any given website and have it displayed on a single page when I search for specific items. Does anyone know where I can find a pre-built tool I can input parameters into, or how to go about creating such a program without being a website developer?
Edit: current suggestions: Octoparse
c09hp8
AskProgramming
ChickenThugs
t3_c09hp8
https://www.reddit.com/r/AskProgramming/comments/c09hp8/price_comparison/
6/13/2019 6:30:11 PM
6/13/2019 7:05:44 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Price comparison
False
1
c09hp8
0
1
1
1
0
0
0
0
0
0
30
55.5555555555556
54
128, 128, 128
3
Solid
50
No
326
Posted
9/11/2021 10:10:55 AM
Hi, dear people of r/AskProgramming
I'm looking for the best way to scrape information from a local website.
I want to print a piece of information (the number of seats available in a library) on a screen for a school project. I need to scrape it from the place's website, since this information is publicly available there.
I was looking for a solution with Python or Node.js, but since I'm really lazy I have seen there are scraping tools and services like Scrapy, OctoParse... but there is a [ton of them](https://hevodata.com/learn/8-best-web-scraping-tools/) and I don't know what to think about it.
So, what is the best way to do it, in your opinion?
pm4t69
AskProgramming
Deztabilizeur
t3_pm4t69
https://www.reddit.com/r/AskProgramming/comments/pm4t69/how_to_scrap_a_public_information_on_web/
9/11/2021 10:10:55 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to scrape public information on the web?
False
1
pm4t69
0
1
63
63
3
2.5
3
2.5
0
0
46
38.3333333333333
120
128, 128, 128
3
Solid
50
No
325
Commented
9/11/2021 12:39:56 PM
Best depends on your requirements.
The most basic way is "curl". Open a terminal, type "curl http://example.com", and you download the HTML page that this website serves. You can then search that HTML file for your info with a simple text search.
The second way would be to use a programming language (and probably a library in that language) like Python and Beautiful Soup.
The last way would be full automation, like Selenium, simulating the whole browser and all clicks, etc. This will work even if the website tries to prevent scraping with JavaScript and other tricks, but it is a lot of overhead.
And it seems like you can pay people for that and use a web service.
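For the second option, here is a minimal sketch using only Python's standard-library `html.parser` (the markup below is made up; a real project would fetch whatever the library's site actually serves, e.g. with `urllib.request`, and match its real element IDs):

```python
from html.parser import HTMLParser

class SeatCountParser(HTMLParser):
    """Extract the text of the element with id="seats" from an HTML page."""
    def __init__(self):
        super().__init__()
        self.in_target = False
        self.seats = None

    def handle_starttag(self, tag, attrs):
        # Flag the element we care about; attrs is a list of (name, value) pairs.
        if ("id", "seats") in attrs:
            self.in_target = True

    def handle_data(self, data):
        # Capture the first run of text inside the flagged element.
        if self.in_target and self.seats is None:
            self.seats = data.strip()
            self.in_target = False

# A static snippet stands in for the library's page here.
html = '<html><body><span id="seats">42</span> seats free</body></html>'
parser = SeatCountParser()
parser.feed(html)
print(parser.seats)  # the scraped value, as a string
```

Beautiful Soup would shorten this to a one-line `select_one("#seats")`, but the stdlib version avoids any dependency.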
hcfg5h0
AskProgramming
McMasilmof
t1_hcfg5h0
https://www.reddit.com/r/AskProgramming/comments/pm4t69/how_to_scrap_a_public_information_on_web/hcfg5h0/
9/11/2021 12:39:56 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
pm4t69
t3_pm4t69
pm4t69
0
pm4t69
False
False
False
0
1
63
63
2
1.73913043478261
0
0
0
0
54
46.9565217391304
115
128, 128, 128
3
Solid
50
No
324
Commented
5/20/2019 9:03:40 AM
I guess Trump's not a bad example given his limited vocabulary, but I wonder how it would fare with more complex statements.
EDIT: given the downvotes I think necessary to mention that I'm not trying to mock Trump here, I'm not even American so I've got no horse in this race. The fact that he restricts himself to a very small set of words is acknowledged and a big part of his (clearly successful) strategy.
eo7mq9q
Python
cym13
t1_eo7mq9q
https://www.reddit.com/r/Python/comments/bqsvjl/5_steps_text_mining_and_sentiment_analysis_using/eo7mq9q/
5/20/2019 9:03:40 AM
5/20/2019 12:56:09 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
bqsvjl
t3_bqsvjl
bqsvjl
1
bqsvjl
False
False
False
0
1
12
12
4
5.33333333333333
4
5.33333333333333
0
0
29
38.6666666666667
75
128, 128, 128
3
Solid
50
No
323
RepliedTo
5/20/2019 12:51:42 PM
We were going to use your accounts, but [everybody knows you never go full retard.](https://www.youtube.com/watch?v=oAKG-kbKeIo)
eo83zjg
Python
CodeSkunky
t1_eo83zjg
https://www.reddit.com/r/Python/comments/bqsvjl/5_steps_text_mining_and_sentiment_analysis_using/eo83zjg/
5/20/2019 12:51:42 PM
1/1/0001 12:00:00 AM
False
False
-1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
eo7mq9q
t1_eo7mq9q
eo7mq9q
0
bqsvjl
False
False
False
1
1
12
12
0
0
1
4.34782608695652
0
0
12
52.1739130434783
23
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
674
Posted
4/19/2020 11:26:10 AM
I installed it the other day but haven't been able to get past the "Reminder: Failed, try again later" screen.
g465al
webscraping
WhoAmITheLaw
t3_g465al
https://www.reddit.com/r/webscraping/comments/g465al/is_octoparse_down/
4/19/2020 11:26:10 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Is Octoparse down?
False
1
g465al
0
4
6
6
0
0
1
5
0
0
9
45
20
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
673
Posted
5/4/2022 12:16:04 PM
[removed]
ui57ql
webscraping
WhoAmITheLaw
t3_ui57ql
https://www.reddit.com/r/webscraping/comments/ui57ql/having_trouble_scraping_all_data_on_a_page_with/
5/4/2022 12:16:04 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Having trouble scraping all data on a page with Octoparse, would appreciate some help
False
1
ui57ql
0
4
6
6
0
0
0
0
0
0
1
100
1
128, 128, 128
3
Solid
50
Yes
671
Commented
4/19/2020 12:05:27 PM
I just tried it and I am not able to properly complete the hello world tutorial. It scrapes some data and then stops scraping; I thought it was because I set it up wrong, but maybe it is down?
Anyway, I am able to use the software without problems.
fnvlt7w
webscraping
AndroidePsicokiller
t1_fnvlt7w
https://www.reddit.com/r/webscraping/comments/g465al/is_octoparse_down/fnvlt7w/
4/19/2020 12:05:27 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
g465al
t3_g465al
g465al
1
g465al
False
False
False
0
1
6
6
1
2.17391304347826
3
6.52173913043478
0
0
15
32.6086956521739
46
128, 128, 128
3
Solid
50
Yes
670
RepliedTo
4/19/2020 2:27:04 PM
Software is the one not loading for me.
fnvwiue
webscraping
WhoAmITheLaw
t1_fnvwiue
https://www.reddit.com/r/webscraping/comments/g465al/is_octoparse_down/fnvwiue/
4/19/2020 2:27:04 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fnvlt7w
t1_fnvlt7w
fnvlt7w
0
g465al
True
False
False
1
1
6
6
0
0
0
0
0
0
3
37.5
8
128, 128, 128
3
Solid
50
Yes
322
Commented
5/4/2022 3:28:12 PM
Help requests without any information are removed. If you request help from the community, please take a minimum of time to explain your issue, showing code, the error message, what's wrong, and what you expect.
The best way to get your issue solved is to provide a minimal reproducible example.
i7awzz0
webscraping
awebscrapingguy
t1_i7awzz0
https://www.reddit.com/r/webscraping/comments/ui57ql/having_trouble_scraping_all_data_on_a_page_with/i7awzz0/
5/4/2022 3:28:12 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ui57ql
t3_ui57ql
ui57ql
1
ui57ql
False
False
True
0
1
6
6
1
2
4
8
0
0
23
46
50
128, 128, 128
3
Solid
50
Yes
321
RepliedTo
5/6/2022 2:42:42 AM
Yes I know. I prefer to provide the details in private
i7iale4
webscraping
WhoAmITheLaw
t1_i7iale4
https://www.reddit.com/r/webscraping/comments/ui57ql/having_trouble_scraping_all_data_on_a_page_with/i7iale4/
5/6/2022 2:42:42 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
i7awzz0
t1_i7awzz0
i7awzz0
0
ui57ql
True
False
False
1
1
6
6
1
9.09090909090909
0
0
0
0
5
45.4545454545455
11
128, 128, 128
3
Solid
50
No
320
Commented
5/4/2022 2:56:20 PM
Not familiar with Octoparse, but the website might be dynamic: certain elements are loaded using JavaScript code.
i7as690
webscraping
Irrelevant-Opinion
t1_i7as690
https://www.reddit.com/r/webscraping/comments/ui57ql/having_trouble_scraping_all_data_on_a_page_with/i7as690/
5/4/2022 2:56:20 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ui57ql
t3_ui57ql
ui57ql
0
ui57ql
False
False
False
0
1
6
6
1
6.25
0
0
0
0
9
56.25
16
128, 128, 128
3
Solid
50
Yes
318
Commented
6/12/2020 10:53:59 PM
This is a paid API, right?
fund3et
whatcarshouldIbuy
collectmoments
t1_fund3et
https://www.reddit.com/r/whatcarshouldIbuy/comments/h7n6f6/template_car_research/fund3et/
6/12/2020 10:53:59 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h7n6f6
t3_h7n6f6
h7n6f6
1
h7n6f6
False
False
False
0
1
31
31
1
16.6666666666667
0
0
0
0
2
33.3333333333333
6
128, 128, 128
3
Solid
50
Yes
317
RepliedTo
6/12/2020 11:26:13 PM
Freemium, I believe. I think you get the first 300 calls free or for like $1.00. I know someone who uses [sharklasers.com](https://sharklasers.com) to generate a new account if they go over API limits. Grey area ethically since it's a personal use case, but use your discretion.
funglyw
whatcarshouldIbuy
JeenyusJane
t1_funglyw
https://www.reddit.com/r/whatcarshouldIbuy/comments/h7n6f6/template_car_research/funglyw/
6/12/2020 11:26:13 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fund3et
t1_fund3et
fund3et
0
h7n6f6
True
False
False
1
1
31
31
1
1.96078431372549
1
1.96078431372549
0
0
23
45.0980392156863
51
128, 128, 128
3
Solid
50
No
316
Commented
6/12/2020 7:36:48 PM
This is cool. I will check it out later
fumqbde
whatcarshouldIbuy
1920sBusinessMan
t1_fumqbde
https://www.reddit.com/r/whatcarshouldIbuy/comments/h7n6f6/template_car_research/fumqbde/
6/12/2020 7:36:48 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h7n6f6
t3_h7n6f6
h7n6f6
0
h7n6f6
False
False
False
0
1
31
31
1
11.1111111111111
0
0
0
0
2
22.2222222222222
9
128, 128, 128
3
Solid
50
Yes
315
Commented
6/12/2020 4:19:12 PM
Very cool. Something that could be helpful is getting insurance quotes for when you are cross shopping various models.
fum0wez
whatcarshouldIbuy
theguyonabike
t1_fum0wez
https://www.reddit.com/r/whatcarshouldIbuy/comments/h7n6f6/template_car_research/fum0wez/
6/12/2020 4:19:12 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h7n6f6
t3_h7n6f6
h7n6f6
1
h7n6f6
False
False
False
0
1
31
31
2
10.5263157894737
0
0
0
0
9
47.3684210526316
19
128, 128, 128
3
Solid
50
Yes
314
RepliedTo
6/12/2020 4:30:42 PM
Totally, I think you could add that once you key in on the cars you're really looking to buy
fum2jic
whatcarshouldIbuy
JeenyusJane
t1_fum2jic
https://www.reddit.com/r/whatcarshouldIbuy/comments/h7n6f6/template_car_research/fum2jic/
6/12/2020 4:30:42 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fum0wez
t1_fum0wez
fum0wez
0
h7n6f6
True
False
False
1
1
31
31
0
0
0
0
0
0
9
47.3684210526316
19
128, 128, 128
3
Solid
50
No
319
Posted
6/12/2020 3:48:39 PM
TL;DR: [I built a spreadsheet that uses MarketCheck's API to help you research cars.](https://airtable.com/universe/expxibkgRqb4gTli4/car-buyers-research?explore=true) [Here's a vid explainer of how to use it.](https://share.getcloudapp.com/P8uejDky)
I'm thinking about getting a hoopty, and wanted to see what my options were regarding:
* Availability/Inventory
* Price
* Miles
As a first pass, I didn't really care about looking for a specific make and model of car, but more so what's available in my price range. So I built an [Airtable Base](https://airtable.com/universe/expxibkgRqb4gTli4/car-buyers-research?explore=true) (better than a spreadsheet) and connected it to MarketCheck's API to pull in all available car listings that you'd see on something like [cars.com](https://cars.com). It pulls in \[Make, Model, Year, VIN, Price, Miles, Photos, Listing URL\]
I have a separate Cars tab to show me which cars by Make-Model-Year (MMY) have listings. (Ex: a 2004 Toyota Camry can have 15 listings). Now I can use the Cars tab to see things like Average Price and Average Mileage for that car, and add links to car forums and reviews for that MMY. The reason I find this most valuable is that I can also drop in notes about things to look out for with that MMY, like recalls or *usual suspects* that typically need to be replaced when a car hits a certain mileage.
It's a lot of fun to explore, but it also helps me stay organized while informing myself about the car that's most right for me. FYI I do use some paid features, but you can get by on the free plan with this because most of the work is done via the scripting block.
I also figured out a way to scrape Facebook Marketplace Data and import it into this tracker as well using [Octoparse](https://www.octoparse.com/). I COULD DEFINITELY USE HELP TWEAKING THIS HMU IF YOU HAVE THOUGHTS (🙏🏾)
Enjoy!
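The per-MMY averages described above are straightforward to compute once the listings are pulled in. A minimal sketch with made-up records (the field names are my assumptions for illustration, not MarketCheck's actual schema):

```python
from collections import defaultdict
from statistics import mean

# Hypothetical listing records; a real feed would come from the MarketCheck API.
listings = [
    {"make": "Toyota", "model": "Camry", "year": 2004, "price": 4200, "miles": 130000},
    {"make": "Toyota", "model": "Camry", "year": 2004, "price": 3800, "miles": 155000},
    {"make": "Honda", "model": "Civic", "year": 2006, "price": 5100, "miles": 110000},
]

def averages_by_mmy(rows):
    """Group listings by Make-Model-Year and average price and mileage."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["make"], row["model"], row["year"])].append(row)
    return {
        mmy: {
            "avg_price": mean(r["price"] for r in rs),
            "avg_miles": mean(r["miles"] for r in rs),
            "listings": len(rs),
        }
        for mmy, rs in groups.items()
    }

stats = averages_by_mmy(listings)
```

This is essentially what the Cars tab rolls up per MMY; Airtable does it with rollup fields instead of code.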
h7n6f6
whatcarshouldIbuy
JeenyusJane
t3_h7n6f6
https://www.reddit.com/r/whatcarshouldIbuy/comments/h7n6f6/template_car_research/
6/12/2020 3:48:39 PM
6/12/2020 3:52:44 PM
False
False
7
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Template: Car Research
False
1
h7n6f6
0
1
31
31
10
3.03030303030303
0
0
0
0
162
49.0909090909091
330
128, 128, 128
3
Solid
50
No
313
Commented
4/20/2022 6:15:56 AM
Thanks for the suggestions! I would also like to add the following:
**Leadfeeder**\- This tool tells you which companies visit your website, even if they do not fill out a form or otherwise contact you. The Leadfeeder tracker connects your website and your account to Google Analytics to get the details of the actions of corporate visitors to your website.
**Mail Engine**\- This low-cost email marketing automation tool will carry out your email marketing campaign on a pre-set schedule while applying the latest technologies to ensure high open rates and conversions.
**AdPlify**\- This is a social media advertising SaaS that applies advanced analytics, optimization and targeting to enhance, place and manage your ads perfectly on Facebook.
**Kontentino**\- This software allows you to schedule posts on LinkedIn and collaborate with your team as per a defined workflow. It supports automation of all types of posts on LinkedIn, including images, carousels and videos.
**Google Analytics**\- This is a popular and trusted tool that gives valuable information about the traffic on your website in real-time for free.
**UberSuggest**\- This is a convenient keyword research, traffic analysis and backlink research tool that provides important insights at low costs to help you optimize your website and SEO.
**Postifluence**\- This is an advanced SEO tool that safely builds organic backlinks on leading blog sites in your niche, bringing high-quality traffic to your website at a very low price.
**Global Database**\- This is a handy tool for lead generation and marketing research, especially for B2B companies. It provides contacts and information about millions of key business decision makers. The details include email, phone number, birth date, social links, employment history, link to company profile, employees’ list and more.
i5g79mr
u_Octoparseideas
ajah-wawan
t1_i5g79mr
https://www.reddit.com/r/u_Octoparseideas/comments/t8leck/how_to_conduct_b2b_lead_generation_10_tips_and/i5g79mr/
4/20/2022 6:15:56 AM
1/1/0001 12:00:00 AM
False
False
5
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
t8leck
t3_t8leck
t8leck
0
t8leck
False
False
False
0
1
3
3
15
5.26315789473684
0
0
0
0
153
53.6842105263158
285
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
311
Commented
10/3/2022 2:10:49 PM
> We also tried converting the database file into a CSV file
[like this?](https://pastebin.com/raw/Y9Ti6izh) (2500/18169)
iqvvhdn
webscraping
Goblin80
t1_iqvvhdn
https://www.reddit.com/r/webscraping/comments/xt5j31/octoparse_and_parsehub_could_not_scrape_my_url/iqvvhdn/
10/3/2022 2:10:49 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
xt5j31
t3_xt5j31
xt5j31
1
xt5j31
False
False
False
0
5
44
44
0
0
0
0
0
0
11
55
20
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
Yes
310
RepliedTo
10/3/2022 8:07:44 PM
>like this?
Yes, that's farther than we got lol. So we're looking to get that data into a file type that we can import into [glideapps.com](https://glideapps.com) or [softr.io](https://softr.io)
iqxeddo
webscraping
Intelligent-Age-3129
t1_iqxeddo
https://www.reddit.com/r/webscraping/comments/xt5j31/octoparse_and_parsehub_could_not_scrape_my_url/iqxeddo/
10/3/2022 8:07:44 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iqvvhdn
t1_iqvvhdn
iqvvhdn
2
xt5j31
True
False
False
1
9
44
44
0
0
0
0
0
0
14
38.8888888888889
36
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
309
RepliedTo
10/4/2022 12:10:02 AM
# HeatList to CSV (unminified)
```javascript
// Cartesian product of arrays, e.g. cartesian([1], [2, 3]) -> [[1, 2], [1, 3]].
const cartesian = (...a) => a.reduce((a, b) => a.flatMap(d => b.map(e => [d, e].flat())));

// Turn one <table> into semicolon-delimited rows, prefixing each row with
// the section name `s` and the table's own heading (its previous sibling's text).
function parseTable(s, table) {
    return cartesian([s], [table.previousElementSibling.textContent],
        [...table.querySelectorAll('tr')].slice(1)              // skip the header row
            .map(t => [...t.querySelectorAll('td')])
            .map(x => x.map(y => y.textContent.trim())))
        .map(x => x.join(';'))
        .flat();
}

// Parse every table inside one TABLE_CODE div, using its <strong> text as the section name.
function parseDiv(div) {
    return [...div.querySelectorAll('table')]
        .flatMap(t => parseTable(div.querySelector('strong').textContent.trim(), t));
}

// Replace the page body with the CSV rows, one per line.
document.body.innerHTML = [...document.querySelectorAll('[id^=TABLE_CODE]')].map(d => parseDiv(d)).flatMap(x => x.join('<br>')).join('<br>');
alert('done.');
```
iqycclg
webscraping
Goblin80
t1_iqycclg
https://www.reddit.com/r/webscraping/comments/xt5j31/octoparse_and_parsehub_could_not_scrape_my_url/iqycclg/
10/4/2022 12:10:02 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iqxeddo
t1_iqxeddo
iqxeddo
1
xt5j31
False
False
False
2
5
44
44
0
0
0
0
0
0
76
77.5510204081633
98
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
Yes
308
RepliedTo
10/4/2022 12:23:34 AM
what's unminified?
iqye2tf
webscraping
Intelligent-Age-3129
t1_iqye2tf
https://www.reddit.com/r/webscraping/comments/xt5j31/octoparse_and_parsehub_could_not_scrape_my_url/iqye2tf/
10/4/2022 12:23:34 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iqycclg
t1_iqycclg
iqycclg
0
xt5j31
True
False
False
3
9
44
44
0
0
0
0
0
0
1
50
2
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
307
RepliedTo
10/4/2022 12:08:29 AM
1. Open the HeatList URL in your web browser (tested on Firefox and Chromium).
1. Create a new bookmark.
1. Replace the URL of the bookmark with the line below.
1. Click the bookmark in your bookmarks toolbar.
1. Wait until the message box that says "done." appears.
1. The site's content should be replaced with its CSV representation.
1. The CSV is delimited by ';' (semicolon), not ',' (comma).
1. Copy-paste the updated content of the webpage into a text file or spreadsheet program.
# HeatList to CSV
```
javascript:(function(){cartesian = (...a) => a.reduce((a, b) => a.flatMap(d => b.map(e => [d, e].flat())));function parseTable(s, table) {return cartesian([s], [table.previousElementSibling.textContent],[...table.querySelectorAll('tr')].slice(1).map(t => [...t.querySelectorAll('td')]).map(x => x.map(y => y.textContent.trim()))).map(x => x.join(';')).flat()}function parseDiv(div) {return [...div.querySelectorAll('table')].flatMap(t => parseTable(div.querySelector('strong').textContent.trim(), t))};document.body.innerHTML = [...document.querySelectorAll('[id^=TABLE_CODE]')].map(d => parseDiv(d)).flatMap(x => x.join('<br>')).join('<br>');alert('done.')})()
```
iqyc5fz
webscraping
Goblin80
t1_iqyc5fz
https://www.reddit.com/r/webscraping/comments/xt5j31/octoparse_and_parsehub_could_not_scrape_my_url/iqyc5fz/
10/4/2022 12:08:29 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iqxeddo
t1_iqxeddo
iqxeddo
1
xt5j31
False
False
False
2
5
44
44
0
0
0
0
0
0
122
65.945945945946
185
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
Yes
306
RepliedTo
10/4/2022 12:18:40 AM
Nice, I had no idea you could do that right in the browser. Thanks!
iqydgb7
webscraping
Intelligent-Age-3129
t1_iqydgb7
https://www.reddit.com/r/webscraping/comments/xt5j31/octoparse_and_parsehub_could_not_scrape_my_url/iqydgb7/
10/4/2022 12:18:40 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iqyc5fz
t1_iqyc5fz
iqyc5fz
0
xt5j31
True
False
False
3
9
44
44
2
16.6666666666667
0
0
0
0
3
25
12
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
305
Commented
10/2/2022 7:42:36 AM
The website seems easy to scrape with a few lines of Python. Would it be completely impossible for you to use Python?
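For example, here's a minimal sketch of the idea using only Python's standard library, assuming the heat list is plain static HTML tables (for the real page you'd fetch the HTML first, e.g. with `urllib.request`, then feed it in):

```python
# Minimal sketch: extract HTML table cells into semicolon-delimited CSV rows
# using only the standard library. Real usage would fetch the page HTML first.
from html.parser import HTMLParser

class TableToCSV(HTMLParser):
    """Collect every <td>'s text, emitting one CSV row per <tr>."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_td = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_td = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._row:
            self.rows.append(";".join(self._row))   # semicolon-delimited, like the bookmarklet
        elif tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td and data.strip():
            self._row.append(data.strip())

# Toy stand-in for the heat list page (dancer names here are made up).
html = "<table><tr><td>101</td><td>Jane Doe</td></tr><tr><td>102</td><td>John Roe</td></tr></table>"
p = TableToCSV()
p.feed(html)
print("\n".join(p.rows))
```

Grouping rows by dancer is then just a matter of keying the CSV rows on the name column before importing into Glide or Softr.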
iqq30k3
webscraping
Accomplished-Gap-748
t1_iqq30k3
https://www.reddit.com/r/webscraping/comments/xt5j31/octoparse_and_parsehub_could_not_scrape_my_url/iqq30k3/
10/2/2022 7:42:36 AM
10/2/2022 8:21:10 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
xt5j31
t3_xt5j31
xt5j31
1
xt5j31
False
False
False
0
2
44
44
1
4.54545454545455
2
9.09090909090909
0
0
8
36.3636363636364
22
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
304
RepliedTo
10/2/2022 4:38:10 PM
At the moment it would be impossible for me/them only because we are not developers. Never used python before. Would that option also be able to keep the data tied to each dancer?
iqro6ia
webscraping
Intelligent-Age-3129
t1_iqro6ia
https://www.reddit.com/r/webscraping/comments/xt5j31/octoparse_and_parsehub_could_not_scrape_my_url/iqro6ia/
10/2/2022 4:38:10 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iqq30k3
t1_iqq30k3
iqq30k3
1
xt5j31
True
False
False
1
4
44
44
0
0
1
2.94117647058824
0
0
12
35.2941176470588
34
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
303
RepliedTo
10/2/2022 5:33:38 PM
Yes, it can. How often would you like to extract the data? Is the data updated frequently during the competition?
iqrwx00
webscraping
Accomplished-Gap-748
t1_iqrwx00
https://www.reddit.com/r/webscraping/comments/xt5j31/octoparse_and_parsehub_could_not_scrape_my_url/iqrwx00/
10/2/2022 5:33:38 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iqro6ia
t1_iqro6ia
iqro6ia
1
xt5j31
False
False
False
2
2
44
44
0
0
0
0
0
0
8
36.3636363636364
22
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
302
RepliedTo
10/2/2022 6:01:03 PM
The data is finalized before the competition starts. So it wouldn't be updated during the competition.
We only need to do this 1x per year, just a few days before the competition start date.
iqs1hjq
webscraping
Intelligent-Age-3129
t1_iqs1hjq
https://www.reddit.com/r/webscraping/comments/xt5j31/octoparse_and_parsehub_could_not_scrape_my_url/iqs1hjq/
10/2/2022 6:01:03 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iqrwx00
t1_iqrwx00
iqrwx00
0
xt5j31
True
False
False
3
4
44
44
0
0
0
0
0
0
17
50
34
128, 128, 128
3
Solid
50
No
312
Posted
10/1/2022 8:36:05 PM
Hi there, I'm helping a friend try to give their local dance competition's Heat List page a modern, clean look. The program they use for registering and scoring their dancers also generates a URL for them to keep track of what time they are scheduled to dance on the floor, called a Heat List.
We want to take the Heat List data and put it into a No-code builder like Glide or [Softr.io](https://Softr.io) for free because the competition is only 1x per year.
We tried Octoparse & Parsehub, but their software can't handle the data, I guess? *I reached out to their support teams, and this is what they told me.*
We also tried converting the database file into a CSV file so we can import into Glide app or Softr but we can't find a database converting tool that can convert a .dbc file.
Can anyone here help me find the best solution for my friend? Any help/ideas are greatly appreciated!
Here is a sample URL of the data they are looking to scrape:
[http://www.comp-mngr.com/millennium2022/Millennium2022\_HeatLists.htm](http://www.comp-mngr.com/millennium2022/Millennium2022_HeatLists.htm)
xt5j31
webscraping
Intelligent-Age-3129
t3_xt5j31
https://www.reddit.com/r/webscraping/comments/xt5j31/octoparse_and_parsehub_could_not_scrape_my_url/
10/1/2022 8:36:05 PM
10/1/2022 11:20:35 PM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse and Parsehub could not scrape my URL because the service runs out of memory. Can someone here help me find a solution?
False
1
xt5j31
0
1
44
44
7
3.53535353535354
0
0
0
0
92
46.4646464646465
198
Red
10
Dash Dot Dot
20
No
909
Posted
8/5/2021 1:45:56 AM
Data visualization presents information and data in visual patterns that help people gain insights effectively. A [data visualization tool](https://www.octoparse.es/) uses visual elements such as charts and tables to make the data speak. There are many data visualization tools on the market.
Which is the best one? Here is a list of the 30 best data visualization tools in 2021, including their pros, cons, and examples, so you can decide which would suit your needs.
We divide them into two categories: tools that require no programming and tools for developers only. Within each category, the tools are grouped by specialization: some, like Tableau, offer a wide range of charts and tables; some, like Infogram, are well known for infographics; and some, like Gephi, are gaining popularity thanks to interactive graphs.
https://preview.redd.it/1sjfxf1p2gf71.png?width=700&format=png&auto=webp&v=enabled&s=a94c3468ac5874a0ef378ad247687142be804b27
## Catalog
* Tools for non-technical professionals
1. Tables and charts
* [Free](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div1)
* [Commercial - for individuals or businesses](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div2)
* [Commercial - businesses only](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div3)
1. [Infographics](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div4)
2. [Maps](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div5)
3. [Network graphs](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div6)
4. [Mathematical graphs](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div7)
* Tools for developers
1. Tables and charts
* [Free](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div8)
* [Commercial](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div9)
1. [Maps](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div10)
2. [Network graphs](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div11)
3. [Financial charts](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div12)
[Conclusion](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div13)
**Tools for non-technical professionals**
**1. Tables and charts**
**Free:**
**1)** [**RAWGraphs**](https://rawgraphs.io/)
RAWGraphs is an open-source web tool and data visualization framework. Its goal is to provide the missing link between spreadsheet applications (e.g., Microsoft Excel and Apple Numbers) and vector graphics editors (e.g., Adobe Illustrator and Sketch). You simply paste your data into RAWGraphs, customize your charts, and export them as vector (SVG) or raster (PNG) images. Moreover, data loaded into RAWGraphs is processed only by the web browser, which keeps your data secure.
**Pros**
* Free and open source
* Intuitive and efficient
* Has help documentation
**Cons**
* Not many adjustable options
**2)** [**ChartBlocks**](https://www.chartblocks.com/)
ChartBlocks is a simple online chart-building tool, and its data import wizard guides you step by step through importing data and designing charts. Unlike RAWGraphs, it lets you easily share your charts on social media. You can also export charts as editable vector graphics or embed them in websites with a free personal account. Professional and elite accounts are offered as well.
**Pros**
* Free and reasonably priced paid plans are available
* Easy-to-use wizard for importing the data you need
**Cons**
* Unclear how robust its API is
* Does not appear to have any mapping capability
**Commercial - for individuals or businesses**
Some data visualization tools offer different paid plans for individuals, small teams, and organizations. These tools have more features and technical support than the free ones.
**3)** [**Tableau**](https://www.tableau.com/)
Tableau is famous worldwide for letting people turn data into effective visualizations (charts, graphs, and even maps). It is a very powerful, secure, and flexible analytics platform: you can drag your data into Tableau and chart it with your colleagues, and you can view the generated reports via desktop, browser, mobile, or embedded in any application.
**Pros**
* Hundreds of data import options
* Mapping capability
* Free public version available
* Many tutorial videos to walk you through how to use Tableau
**Cons**
* The non-free versions are expensive ($70/month per user for Tableau Creator)
* The public version does not let you keep your data analyses private
**4)** [**Power BI**](https://powerbi.microsoft.com/)
Power BI is a suite of business analytics tools developed by Microsoft and therefore well integrated with Microsoft Office. Users can import any data, such as files, folders, and databases, and view it anywhere via the desktop software, the online web editor, or the mobile apps. Power BI is free for individual users and charges only $9.99 per team user per month, so anyone on the team can analyze data and make decisions at any time.
**Pros**
* Affordable and relatively inexpensive
* Offers a wide range of custom visualizations
* Option to upload and view your data in Excel
* Can import data from a wide range of data sources
* Fast updates
**Cons**
* Does not handle complex relationships between tables well
* Does not provide many options for configuring your visualizations
* Crowded user interface
**5)** [**QlikView**](https://www.qlik.com/us/products/qlikview)
QlikView is a business intelligence tool aimed mainly at business users in organizations: users can easily analyze their data and use QlikView's business analytics and reporting capabilities to support decision-making. QlikView also offers a Personal Edition so that individual users can enjoy its powerful features. You can simply type the keywords you want to search for within the dataset, and QlikView helps you find unexpected insights and data associations.
**Pros**
* Provides a dynamic business intelligence ecosystem for the user
* Data sharing
* Low maintenance
* Offers many attractive, colorful data visualization options
**Cons**
* Limited RAM
* Difficult application development
* Requires many additional purchases
**6)** [**FineReport**](http://www.finereport.com/en/?utm_source=Octoparse&utm_medium=media&utm_term=30dvtools&utm_content=30dvtools)
FineReport is reporting and dashboard software with impressive visualization effects. It provides stunning self-developed HTML5 charts that can be displayed seamlessly on any website or web page, with cool 3D and dynamic effects. The visualizations adapt to any screen size, from TVs and large displays to mobile devices, and easy drag-and-drop operations achieve all the effects.
What's more, the surprising thing I discovered is that FineReport is [FREE for individual users.](http://...
oy7hqn
u_melisaxinyue
melisaxinyue
t3_oy7hqn
https://www.reddit.com/r/u_melisaxinyue/comments/oy7hqn/las_30_mejores_herramientas_de_visualización_de/
8/5/2021 1:45:56 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
The 30 Best Data Visualization Tools in 2021
False
1
oy7hqn
0
7400
5
5
9
0.784655623365301
0
0
0
0
667
58.151700087184
1147
Red
10
Dash Dot Dot
20
No
908
Posted
8/11/2021 9:34:24 AM
CAPTCHAs are one of the most popular anti-scraping techniques implemented by website owners. [reCaptcha v3](https://www.google.com/recaptcha/) is Google's CAPTCHA integration solution for detecting bot traffic on websites; [NuCaptcha](https://www.nucaptcha.com/) and [hCaptcha](https://www.hcaptcha.com/) are some other advanced CAPTCHA solutions. But CAPTCHAs are quite irritating, not only for users but also for web scrapers, and solving them is one of the [main challenges web scrapers face](https://www.octoparse.es/blog/desafios-para-extraer-datos-de-comercio-electronico-web). Read on to learn different ways to solve CAPTCHAs while extracting content from your target website. Here is how the article is structured:
**Table of contents:**
[What is a CAPTCHA? And what is reCaptcha?](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h1)
[Popular types of CAPTCHA](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h2)
[How to solve/bypass reCAPTCHA while scraping?](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h3)
* [Human-based CAPTCHA solving](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h4)
* [CAPTCHA solving via OCR (optical character recognition)](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h5)
* [Self-solving](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h6)
[Bypassing reCaptcha in Octoparse?](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h7)
[Tips to keep CAPTCHAs from interrupting your scraping experience](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h8)
[Conclusion](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h9)
## What is a CAPTCHA? And what is reCaptcha?
The Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) is an audio, visual, or text-based test generated by automated algorithms. Solving a CAPTCHA requires three skills at which humans are much better than computers:
* Invariant recognition (identifying different shapes or renderings of the same letter or object)
* Segmentation (identifying overlapping letters)
* Context analysis (holistic understanding of the image, text, or audio)
reCaptcha is the most popular CAPTCHA-generating solution. It comes from Google and can be easily integrated into a website.
## What are some popular types of CAPTCHA?
1. Normal CAPTCHA
This is the most widely used CAPTCHA, in which a distorted image contains text that is still readable by humans. To solve a normal CAPTCHA, you type the distorted text into the text box.
2. Text CAPTCHA
TextCaptcha is not as popular, but it is ideal for visually impaired users. It is not image-based; it is purely text. A curl example from [TextCaptcha](http://textcaptcha.com/):
```
$ curl http://api.textcaptcha.com/myemail@example.com.json
{"q": "If tomorrow is Saturday, what day is today?",
 "a": ["f6f7fec07f372b7bd5eb196bbca0f3f4",
       "dfc47c8ef18b4689b982979d05cf4cc6"]}
```
**CAPTCHA:** If tomorrow is Saturday, what day is today?
**SOLUTION:** Friday.
3. Key CAPTCHA
[KeyCaptcha](https://www.keycaptcha.com/) is another CAPTCHA integration service in which you are supposed to solve a puzzle.
4. Click CAPTCHA
Image CAPTCHAs that fall under classification-based puzzles are Click CAPTCHAs. reCaptcha, [ASIRRA](https://www.microsoft.com/en-us/research/publication/asirra-a-captcha-that-exploits-interest-aligned-manual-image-categorization/), and Snapchat's Ghost Captcha are popular examples of classification-based Click CAPTCHAs.
5. Rotate CAPTCHA
These are CAPTCHA puzzles based on image orientation. In a Rotate CAPTCHA, you have to click one or more times to rotate an image so that it meets the verification condition. The most popular condition is placing an object in the "correct position." [FunCaptcha](https://funcaptcha.com/fc/api/nojs/) is one of the Rotate CAPTCHA integration providers, though it appears to [no longer work](https://github.com/ad-m/python-anticaptcha/issues/69). [RVerify.js](https://rverify.vercel.app/) is an open-source JavaScript library for verifying image orientation.
6. GeeTest CAPTCHA
[GeeTest](https://www.geetest.com/) CAPTCHAs are interesting: here you have to move a puzzle piece, often by dragging a slider, or select certain images in a particular order.
7. hCaptcha
[hCaptcha](https://www.hcaptcha.com/) is very similar to reCaptcha. The only difference is that with hCaptcha many companies can benefit from the data labeling that USERS perform when they click on a website, whereas with reCaptcha only Google benefits from the crowdsourced data labeling.
8. Capy puzzle
Similar to KeyCaptcha, the Capy Puzzle is a puzzle-based CAPTCHA service. [CAPY.ME](https://www.capy.me/products/puzzle_captcha/) is a service for integrating Capy puzzles into websites.
Read more about [the types of CAPTCHA.](https://www.octoparse.es/blog/5-things-you-need-to-know-of-bypassing-captcha-for-web-scraping)
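One note on the TextCaptcha example above: the `a` field holds MD5 digests rather than plaintext answers (reportedly the MD5 of each accepted answer, lower-cased), so a client only needs to hash its guess and compare. A quick sketch:

```python
# Check a TextCaptcha guess against the accepted answer hashes.
# Assumption (per TextCaptcha's documented behavior): answers are MD5
# digests of the lower-cased answer text.
import hashlib

def is_correct(guess, answer_hashes):
    """True if the MD5 of the normalized guess matches any accepted hash."""
    digest = hashlib.md5(guess.strip().lower().encode("utf-8")).hexdigest()
    return digest in answer_hashes

# Self-contained demo: hash a known answer ourselves, then verify a guess.
accepted = [hashlib.md5(b"friday").hexdigest()]
print(is_correct("  Friday ", accepted))
```

The normalization (strip and lower-case) is what lets "  Friday " match the stored hash of "friday".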
## How do you solve/bypass reCAPTCHA while scraping?
Whether you are scraping with an advanced no-code "click and scrape" screen-scraping tool or with your own scraper written in Python, Java, or JavaScript, it is possible to solve and bypass all kinds of CAPTCHAs. Although no service or solution guarantees a 100% CAPTCHA solve rate, you can reach solve rates of up to 90% using popular tools such as [DeathByCaptcha](http://deathbycaptcha.com/), [2captcha](https://2captcha.com/), etc.
There are two popular approaches to solving CAPTCHAs:
* **Human-based CAPTCHA solving**
CAPTCHAs are made to be solved by humans. Some companies employ thousands of people to solve these CAPTCHAs in real time, at a very low price. Efficiency is quite high, but latency is a problem with this approach.
## So how should you use a CAPTCHA solving service while scraping?
There are several CAPTCHA solving service providers on the market, some of the most notable being:
* DeathByCaptcha
* AZCaptcha
* ImageTyperZ
* EndCaptcha
* BypassCaptcha
* CaptchaTronix
* AntiCaptcha
* 2Captcha
* CaptchaSniper
All of these service providers follow a similar approach:
1. Sign up on their website and get a token and credentials by paying the fee, or perhaps for free if a trial version is available.
2. Implement their API/plugin using the language of your choice, i.e., Python, PHP, Java, JS, etc.
3. Send your CAPTCHAs to their API.
4. Receive the solved CAPTCHAs in the API response.
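In code, that submit-then-poll flow looks roughly like the sketch below; the endpoint names and responses are illustrative, not any specific provider's API:

```python
# Sketch of the generic submit-then-poll flow these services share.
# `transport` stands in for the actual HTTP calls; "submit"/"result" and
# "NOT_READY" are illustrative, not a real provider's endpoints.
import time

def solve_captcha(image_bytes, api_key, transport, poll_interval=0.0, max_polls=10):
    """Submit a CAPTCHA image, then poll until the service returns the text."""
    task_id = transport("submit", {"key": api_key, "image": image_bytes})
    for _ in range(max_polls):
        result = transport("result", {"key": api_key, "id": task_id})
        if result != "NOT_READY":
            return result
        time.sleep(poll_interval)
    raise TimeoutError("CAPTCHA not solved in time")

# Fake transport standing in for the provider: ready on the second poll.
state = {"polls": 0}
def fake_transport(endpoint, payload):
    if endpoint == "submit":
        return "task-42"
    state["polls"] += 1
    return "NOT_READY" if state["polls"] < 2 else "W7X9A"

print(solve_captcha(b"...", "demo-key", fake_transport))
```

In a real integration, `transport` would be two HTTP requests against the provider's submit and result endpoints, with the poll interval set per their rate limits.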
* **CAPTCHA solving via OCR (optical character recognition)**
This is a programmatic approach to solving CAPTCHAs. OCR stands for optical character recognition (or optical character reader): an electronic or mechanical approach to converting typed, handwritten, or printed text into machine-encoded text. You can feed a scanned document, an image, or a scene (for example, billboards) to an OCR. There are open-source tools such as [TESSERACT](https://github.com/tesseract-ocr/tesseract), [GOCR](https://en.wikipedia.org/wiki/GOCR), [OCRAD](https://github.com/antimatter15/ocrad.js/), etc., to get you started, so there is no need to start from scratch. OCRs are able to successfully solve different types of image-based CAPTCHAs.
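Before feeding an image to one of these OCR engines, scrapers typically clean it up first; a simple binarization pass, sketched here in pure Python on a toy grayscale "image", is often the first step (a real pipeline would use Pillow or OpenCV plus pytesseract):

```python
# Common OCR preprocessing step: binarize a grayscale image so the letters
# stand out from a noisy background. The "image" here is a toy list of
# rows of 0-255 pixel values, not a real CAPTCHA.
def binarize(image, threshold=128):
    """Map pixels darker than the threshold to 1 (ink) and the rest to 0."""
    return [[1 if px < threshold else 0 for px in row] for row in image]

noisy = [
    [250,  40, 245],
    [ 30,  35, 240],
    [245,  45, 250],
]
for row in binarize(noisy):
    print(row)
```

After binarization (and usually denoising and segmentation), the cleaned image is what gets handed to Tesseract or a similar engine for character recognition.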
* **Self-solving**...
p29oud
u_melisaxinyue
melisaxinyue
t3_p29oud
https://www.reddit.com/r/u_melisaxinyue/comments/p29oud/cómo_resolver_captcha_mientras_se_raspa_la_web/
8/11/2021 9:34:24 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Solve CAPTCHA While Scraping the Web?
False
1
p29oud
0
7400
5
5
5
0.407830342577488
1
0.0815660685154976
0
0
686
55.9543230016313
1226
Red
10
Dash Dot Dot
20
No
907
Posted
11/13/2020 9:48:46 AM
We all know how important data is nowadays; the question is how to maximize the value of web data for our best interests. [**Web scraping**](https://www.octoparse.es/) is the process of getting data from any website into a structured format such as CSV or Excel. By automating the whole extraction process, it saves you the valuable time spent on tedious tasks so you can focus on what really matters.
I have described the 25 most popular ways to grow your business with web scraping. I am sure it will give you a boost and spark some ideas.
**Table of contents**
* [Marketing](https://www.octoparse.es/blog/25-maneras-crecer-su-negocio)
* [E-commerce and Retail](https://www.octoparse.es/blog/25-maneras-crecer-su-negocio)
* [Data Science](https://www.octoparse.es/blog/25-maneras-crecer-su-negocio)
* [Equity and Financial Research](https://www.octoparse.es/blog/25-maneras-crecer-su-negocio)
* [Data Journalism](https://www.octoparse.es/blog/25-maneras-crecer-su-negocio)
* [Academia](https://www.octoparse.es/blog/25-maneras-crecer-su-negocio)
* [Risk Management](https://www.octoparse.es/blog/25-maneras-crecer-su-negocio)
* [Insurance](https://www.octoparse.es/blog/25-maneras-crecer-su-negocio)
* [Others](https://www.octoparse.es/blog/25-maneras-crecer-su-negocio)
**Marketing**
* [**Content Marketing:**](https://service.octoparse.com/socialmedia)
It is hard to come up with remarkable ideas for your next blog posts that beat your competitors'. Stop wasting time staring at Google's search result pages: you can scrape all the information, including the Google search results, into a single spreadsheet, then get a general sense of which topics are most likely to rank and what their titles and descriptions look like.
* **Competitive Monitoring:**
Competitive monitoring usually requires getting data from several websites at the same time. To keep pace, you may also need to extract the information on a regular basis. Web scraping tools like Octoparse automate the whole data extraction process.
* **Lead Generation:**
Leads are vital for any business to survive. If you are ready to scale, you need more of them. Stop burning money on leads that will not convert; [**web scraping tools**](https://www.octoparse.es/blog/30-mejores-software-gratuitos-de-web-scraping) can scrape that data from websites.
* **SEO Monitoring:**
Monitor your SEO efforts by extracting keyword-related results and rankings. Web scraping lets you understand why and how competitors can outrank your position.
* **Brand Monitoring:**
Maintaining your online image can be tedious, since you would have to watch the screen all day. With scraping you can get negative and positive posts and comments in real time, and it can also help you detect fraudulent messages early.
https://preview.redd.it/pxmfx0abbzy51.png?width=640&format=png&auto=webp&v=enabled&s=7fe5eb6f101ab451d5b7bedfd5fae56ed7907389
## E-commerce and Retail

* [**Price Intelligence:**](https://service.octoparse.com/ecommercedata)
It's hard to keep customers when you raise prices, yet you need to cut marginal costs and lift profits. What is the perfect price for your product? This is where web scraping comes in: you can extract prices for the same item from different sources, then study the pricing strategies others deploy. Are they running promotions? Are they cutting their prices?
* **MAP Compliance:**
When you have multiple distribution channels across different stores and countries, it's hard to control how each one prices your product. With the help of web scraping, manufacturers can extract product and pricing information, which makes it much easier to spot who is violating MAP (minimum advertised price).
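Once the prices have been scraped, the MAP check itself is a one-liner; a minimal sketch, where the MAP value and the listing format are hypothetical:

```python
# Flag resellers whose scraped price undercuts the minimum advertised price (MAP).
MAP_PRICE = 49.99  # hypothetical MAP for one product


def map_violations(listings):
    """listings: scraped (seller, price) pairs; return the sellers below MAP."""
    return [seller for seller, price in listings if price < MAP_PRICE]
```

Run against a fresh scrape of each distributor's product page, this turns MAP enforcement into a routine report.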
* **Product Intelligence:**
Finding the best-selling product is a challenge. Web data extraction automates the collection of product listings and categories, surfacing which products sell best. Gathering product information also helps you make sound decisions about product assortment.
## Data Science
* **Natural Language Processing:**
I bet you're familiar with the term NLP. In most cases it is used to analyze customer sentiment, and web scraping is the best way to supply the continuous stream of data these insight-hungry algorithms need.
* **Training Machine Learning Models:**
Machine learning is a buzzword these days. Essentially, it means we throw a pile of data at a model; the model studies the data and builds its own logic, and the more data you feed it, the more accurate its output becomes. In that sense, web data extraction is ideal for pulling valuable data from multiple sources at scale in a short time.
* **Predictive Analytics:**
Web scraping plays an important role in predictive analytics, since it collects the data used to predict and forecast trends. Accurate forecasting helps companies estimate future markets, uncover unforeseen risks, and gain a competitive edge.
&#x200B;
https://preview.redd.it/f2kn6n2cbzy51.png?width=640&format=png&auto=webp&v=enabled&s=5368e8ccd7858e770e1dd27ef8880e8af3e24e33
## Equity and Financial Research

* **News Aggregation:**
Collecting and curating news articles across publications is hard work. You can use a data extraction tool to gather the articles; better still, you can build a niche feed of up-to-date information for your readers by scraping the RSS feeds of different blogs.
* [**Hedge Funds:**](https://www.octoparse.com/blog/how-web-scrapping-helps-hedge-funds-gain-competitive-edge)
The hedge fund industry was among the first adopters of web data extraction for assessing investment risks and potential trading opportunities. As of today, investment firms tend to spend more and more on data to guide their investment decisions.
* **Financial Statements:**
Collecting financial statements from many sources in a structured format can be daunting work. Manually reviewing hundreds of thousands of documents for analysis will slow you down, and that doesn't work in a fast-paced environment like a finance department. Web scraping can gather financial reports into usable formats automatically, so important investment decisions can be made on time.
* **Market Research:**
Conduct thorough market research to help the marketing team plan more effectively. Web data extraction makes it easy to pull data from multiple social media sites, yielding insights to feed your marketing strategy.
## Data Journalism
It isn't writing the news report that's hard; it's uncovering the truth. That is what makes data-driven journalists remarkable: they take a scientific approach to analyzing data and information. Web data extraction gives journalists the ability to build their own databases of collected information, letting them discover new stories on the Internet.
## Academic
We've all been there -- scrolling page by...
---
*Source post: "25 Maneras de Web Scraping Técnicas para Crecer Negocio" ("25 Ways Web Scraping Techniques Grow a Business"), posted by u/melisaxinyue to r/u_melisaxinyue on 11/13/2020: https://www.reddit.com/r/u_melisaxinyue/comments/jtedz8/25_maneras_de_web_scraping_técnicas_para_crecer/*
---
[Web scraping](https://www.octoparse.es/) is a way of extracting data from the web using automation tools and technologies. Companies used to be quite casual about collecting web data, but since [the GDPR regulations](https://www.legislation.gov.uk/eur/2016/679/contents) came into force, due diligence around data extraction has become a must.
Recently, Poland imposed a [fine of €220,000](https://www.achievedcompliance.com/poland-imposes-fines-for-web-scraping-of-personal-data-when-notification-to-individuals-did-not-occur/) on an organization that collected data on around 7 million people but did not inform them (informing data subjects is a rule under Article 14 of the GDPR). A few months ago, the French DPA also issued guidance on commercial web scraping. So we thought we would explain what the GDPR means and why it matters to the scraping community. Read this article to learn everything you need to **comply with the GDPR** while scraping the web.
**Table of contents**
* [When does the GDPR come into play?](https://www.octoparse.es/blog/cumplimiento-de-rgpd-en-web-scraping#h1)
* [What qualifies as personally identifiable information (PII)?](https://www.octoparse.es/blog/cumplimiento-de-rgpd-en-web-scraping#h2)
* [Are you scraping the personal information of EU citizens?](https://www.octoparse.es/blog/cumplimiento-de-rgpd-en-web-scraping#h3)
* [Do you have a legal basis for scraping personal data?](https://www.octoparse.es/blog/cumplimiento-de-rgpd-en-web-scraping#h4)
* [What can you do to comply with the GDPR?](https://www.octoparse.es/blog/cumplimiento-de-rgpd-en-web-scraping#h5)
* [Conclusion](https://www.octoparse.es/blog/cumplimiento-de-rgpd-en-web-scraping#h6)
## When does the GDPR come into play?
First, let's look at what can be extracted from the web, and then examine what kinds of data fall under the GDPR and which do not.
You can scrape:
* Real estate listings for personalized marketing,
* Stock indexes and news portals for market intelligence,
* Job postings to power your HR services,
* Social media sites to analyze customer sentiment,
* Online directories for prospecting,
* Public data from government websites for insight,
* Product data from e-commerce sites for competitor tracking and price intelligence,
* Blogs, videos, and the like.
Of course, the use cases for [data extraction](https://helpcenter.octoparse.es/hc/es/articles/360055954154-Extraer-datos) aren't limited to these, but at a high level this gives you an idea of the different kinds of data you can extract. Now, the **GDPR**, short for the **General Data Protection Regulation (EU) 2016/679**, is a European Union law on data protection and privacy for everyone within the EU and the EEA. The GDPR has two purposes:
* It puts individuals in control of how their data is used
* It simplifies the regulatory environment for companies operating in the EU region
The question is: where do data scraping and the GDPR intersect? When should you worry about the GDPR? The short answer: whenever you extract the personal information of an individual residing in the EU.
To find out whether you need to comply with the GDPR, and to make sure your scraping project is GDPR-compliant, answer the following questions:
* What qualifies as personally identifiable information (PII)?
* Are you scraping the personal information of EU citizens?
* Do you have a legal basis for scraping personal data?
* What can you do to comply with the GDPR?
[Original Image](https://preview.redd.it/6q4otfh1b2i71.png?width=1600&format=png&auto=webp&v=enabled&s=06acea9a7921218a9a63abd61d4fe5bc61ed9b2e)
## What qualifies as personally identifiable information (PII)?
Any data that could help someone trace or identify a person qualifies as PII. Some examples:
* Name
* Email
* Contact numbers
* Postal address
* Credit card details
* Bank details
* IP address
* Date of birth
* Images / video / audio of the person
* Medical reports
* Employment details, etc.
## Are you scraping the personal information of EU citizens?
The GDPR deals strictly with the personally identifiable information of people within the European Union and the European Economic Area (EEA). So the next question is: **are you scraping the data of European citizens?** If the answer is "no", you're in the clear. Say you're extracting data concerning India, the US, or Australia: you needn't worry about the GDPR; instead, look into the data protection laws of the respective jurisdiction, since the GDPR's reach is limited to the EEA. If your scraping projects require you to scrape the PII of EU citizens, you must have a legal basis for doing so.
## Do you have a legal basis for scraping personal data?
The legal bases are set out in [Article 6 of the GDPR](https://www.legislation.gov.uk/eur/2016/679/article/6), and there are six of them for processing scraped data:
1. Consent
This can be your legal basis when the people whose data you are extracting have given you their consent to extract it for specific purposes.
2. Contract
A contract with the data subjects can serve as a legal basis under the GDPR if the contract necessarily requires you to process the data.
3. Legal obligation
The third kind of legal basis applies when processing the data is necessary for you to comply with a legal obligation.
4. Vital interests
You can argue *vital interests* as the legal basis for your scraping project if it is intended to save someone's life.
5. Public tasks
When the data processing is carried out in the public interest or in the exercise of your duties as a public official, that counts as a legal basis.
6. Legitimate interest
If processing the data is necessary for the data controller's legitimate interest, you can also count that as a basis for lawful data processing under the GDPR. But it will not be a valid legal basis if it overrides the fundamental rights or interests of the person whose data is collected and processed.
In short, consent and contract are much the same idea: if people have given their consent, it's fine to process their data. When would this apply? Take an example. Suppose a fashion retail website collects product reviews from shoppers, along with the shoppers' PII, and makes them publicly available in the review section. The PII might be age, name, and location; the general data would be the review text and its timestamp. Now, if you only need to scrape the review text for research to drive your new product's development, you needn't worry about the GDPR. But if you are also scraping the name, age, location, and other details, you are entering PII territory and must comply with the GDPR to stay on the right side of the law.
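In code terms, staying on the safe side of that review example means dropping the PII fields before the scraped records are stored; a minimal sketch, where the field names are hypothetical:

```python
# Fields in a scraped review record that would count as PII under the GDPR.
PII_FIELDS = {"name", "age", "location", "email"}


def strip_pii(record):
    """Keep only the non-personal fields (review text, timestamp, ...)."""
    return {k: v for k, v in record.items() if k not in PII_FIELDS}
```

Filtering at ingestion time, rather than after storage, keeps personal data out of your pipeline entirely.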
Vital interests, public tasks, and legal obligations will rarely form your legal basis; they are clear-cut concepts with little room for theoretical argument. Legitimate interest, however, could be a solid legal basis if you are doing web scraping, although for most companies even that claim is a challenge.
The [HiQ vs LinkedIn](https://es.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn) case also makes for interesting reading.
## What can you do to comply with the GDPR?
Here is a checklist to make sure your data processing and scraping project complies with the GDPR:
* **Steer clear of misreading the articles in ...
---
*Source post: "Cumplimiento de RGPD en web scraping" ("GDPR Compliance in Web Scraping"), posted by u/melisaxinyue to r/u_melisaxinyue on 8/18/2021: https://www.reddit.com/r/u_melisaxinyue/comments/p6lsm1/cumplimiento_de_rgpd_en_web_scraping/*
---
CAPTCHAs are one of the most popular anti-scraping techniques deployed by website owners. [reCaptcha v3](https://www.google.com/recaptcha/) is Google's CAPTCHA integration solution for detecting bot traffic on websites; [NuCaptcha](https://www.nucaptcha.com/) and [hCaptcha](https://www.hcaptcha.com/) are some other advanced CAPTCHA solutions. But CAPTCHAs are quite irritating, not only for users but also for web scrapers, and solving them is one of the [main challenges web scrapers face](https://www.octoparse.es/blog/desafios-para-extraer-datos-de-comercio-electronico-web). Read on to find different ways to solve CAPTCHAs while extracting content from your target website. Here's how the article is structured:
**Table of contents:**
[What is a CAPTCHA? And what is reCaptcha?](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h1)
[Popular types of CAPTCHA](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h2)
[How to solve / bypass reCAPTCHA while scraping?](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h3)
* [Human-based CAPTCHA solving](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h4)
* [CAPTCHA solving with OCR (optical character recognition)](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h5)
* [Self-solving](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h6)
[Bypassing reCaptcha in Octoparse](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h7)
[Tips to keep CAPTCHAs from interrupting your scraping experience](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h8)
[Conclusion](https://www.octoparse.es/blog/como-resolver-captcha-mientras-se-raspa-la-web#h9)
## What is a CAPTCHA? And what is reCaptcha?
The Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) is an audio-, visual-, or text-based test generated by automated algorithms. Solving a CAPTCHA requires three skills at which humans are far better than computers:
* Invariant recognition (identifying different shapes or renderings of the same letter or object),
* Segmentation (identifying overlapping letters),
* Context analysis (understanding an image, text, or audio clip as a whole)
reCaptcha is the most popular CAPTCHA-generating solution. It comes from Google and can easily be integrated into a website.
## What are some popular types of CAPTCHA?
1. Normal CAPTCHA
This is the most widely used CAPTCHA: a distorted image that contains text yet remains legible to humans. To solve it, you enter the distorted text into the text box.
2. Text CAPTCHA
TextCaptcha is not as popular, but it is ideal for visually impaired users: it is not image-based at all, just plain text. A curl example from [TextCaptcha](http://textcaptcha.com/):
```shell
$ curl http://api.textcaptcha.com/myemail@example.com.json
{ "q": "If tomorrow is Saturday, what day is today?",
  "a": ["f6f7fec07f372b7bd5eb196bbca0f3f4",
        "dfc47c8ef18b4689b982979d05cf4cc6"] }
```
**CAPTCHA:** If tomorrow is Saturday, what day is today?
**SOLUTION:** Friday.
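The hashes in the `"a"` field are, as I understand TextCaptcha's scheme, MD5 digests of the accepted answers in lower case, so a site can validate a reply without storing the answers in plain text; a sketch of that check:

```python
import hashlib


def check_answer(reply, accepted_hashes):
    """Hash the user's reply the way TextCaptcha expects (lower-cased)
    and compare it against the list of accepted-answer digests."""
    digest = hashlib.md5(reply.strip().lower().encode()).hexdigest()
    return digest in accepted_hashes
```

A reply of "Friday" or "friday" would then both match the first digest in the example above.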
3. Key CAPTCHA
[KeyCaptcha](https://www.keycaptcha.com/) is another CAPTCHA integration service in which you are expected to solve a puzzle.
4. Click CAPTCHA
Image CAPTCHAs built on classification puzzles are Click CAPTCHAs. reCaptcha, [ASIRRA](https://www.microsoft.com/en-us/research/publication/asirra-a-captcha-that-exploits-interest-aligned-manual-image-categorization/), and Snapchat's Ghost Captcha are popular examples of classification-based Click CAPTCHAs.
5. Rotate CAPTCHA
These are CAPTCHA puzzles based on image orientation. In a Rotate CAPTCHA you click one or more times to rotate an image until it meets the verification condition; the most common condition is putting an object in the "right position". [FunCaptcha](https://funcaptcha.com/fc/api/nojs/) is one provider of Rotate CAPTCHA integration, although it appears to be [no longer working](https://github.com/ad-m/python-anticaptcha/issues/69). [RVerify.js](https://rverify.vercel.app/) is an open-source JavaScript library for verifying image orientation.
6. GeeTest CAPTCHA
[GeeTest](https://www.geetest.com/) CAPTCHAs are interesting: you have to move a puzzle piece, often by dragging a slider, or select certain images in a particular order.
7. hCaptcha
[hCaptcha](https://www.hcaptcha.com/) is very similar to reCaptcha. The only difference is that with hCaptcha, many companies can benefit from the data labeling that users perform as they click around websites; with reCaptcha, only Google benefits from that crowdsourced labeling.
8. Capy Puzzle
Similar to KeyCaptcha, Capy Puzzle is a puzzle-based CAPTCHA service. [CAPY.ME](https://www.capy.me/products/puzzle_captcha/) is a service for integrating Capy puzzles into websites.
Read more about [the types of CAPTCHA.](https://www.octoparse.es/blog/5-things-you-need-to-know-of-bypassing-captcha-for-web-scraping)
## How to solve / bypass reCAPTCHA while scraping?
Whether you scrape with an advanced no-code, point-and-click screen scraping tool or with your own scraper written in Python, Java, or JavaScript, it is possible to solve and bypass every kind of CAPTCHA. Although no service or solution guarantees a 100% CAPTCHA solve rate, you can reach up to about 90% efficiency using popular tools such as [DeathByCaptcha](http://deathbycaptcha.com/) and [2captcha](https://2captcha.com/).
There are two popular approaches to solving CAPTCHAs:
* **Human-based CAPTCHA solving**
CAPTCHAs are made to be solved by humans, and some companies employ thousands of people to solve them in real time at a very low price. Their accuracy is quite high, but latency is the weakness of this approach.
## So how do you use a CAPTCHA-solving service while scraping?
There are several CAPTCHA-solving service providers on the market; some notable ones are:
* DeathByCaptcha
* AZCaptcha
* ImageTyperZ
* EndCaptcha
* BypassCaptcha
* CaptchaTronix
* AntiCaptcha
* 2Captcha
* CaptchaSniper
All of these providers follow a similar approach:
1. Sign up on their website and obtain a token and credentials by paying the fee, or perhaps for free if a trial is available.
2. Implement their API / plugin in a language of your choice, i.e. Python, PHP, Java, JS, etc.
3. Send your CAPTCHAs to their API.
4. Receive the solved CAPTCHAs in the API response.
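Sketched in Python with only the standard library, steps 2 to 4 look roughly like this. The endpoints and JSON fields below are hypothetical stand-ins; substitute whatever your chosen provider's API actually defines:

```python
import base64
import json
import time
import urllib.parse
import urllib.request

API_KEY = "YOUR_TOKEN"                                # obtained in step 1
SUBMIT_URL = "https://api.example-solver.com/submit"  # hypothetical endpoints
RESULT_URL = "https://api.example-solver.com/result"


def submit(image_bytes):
    """Step 3: POST the CAPTCHA image; the service replies with a task id."""
    form = urllib.parse.urlencode({
        "key": API_KEY,
        "body": base64.b64encode(image_bytes).decode(),
    }).encode()
    with urllib.request.urlopen(SUBMIT_URL, form) as resp:
        return json.load(resp)["id"]


def parse_result(payload):
    """Step 4: the answer is only present once the task is marked ready."""
    return payload.get("answer") if payload.get("status") == "ready" else None


def solve(image_bytes, timeout=60, poll_every=5):
    """Submit, then poll until the (human or OCR) workers produce an answer."""
    task_id = submit(image_bytes)
    deadline = time.time() + timeout
    while time.time() < deadline:
        time.sleep(poll_every)  # the latency mentioned above lives here
        query = urllib.parse.urlencode({"key": API_KEY, "id": task_id})
        with urllib.request.urlopen(f"{RESULT_URL}?{query}") as resp:
            answer = parse_result(json.load(resp))
        if answer is not None:
            return answer
    raise TimeoutError("CAPTCHA not solved within the timeout")
```

The submit-then-poll shape is common to most of the providers listed above, even though field names and URLs differ.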
* **CAPTCHA solving with OCR (optical character recognition)**
This is a programmatic approach to solving CAPTCHAs. OCR stands for optical character recognition (or optical character reader): an electronic or mechanical way of converting typed, handwritten, or printed text into machine-encoded text. You can feed a scanned document, an image, or a scene photo (for example, a billboard) into an OCR. Open-source tools such as [TESSERACT](https://github.com/tesseract-ocr/tesseract), [GOCR](https://en.wikipedia.org/wiki/GOCR), and [OCRAD](https://github.com/antimatter15/ocrad.js/) exist to get you started, so there's no need to begin from scratch. OCRs can successfully solve various types of image-based CAPTCHA.
* **Self-solving**...
---
*Source post: "¿Cómo resolver captcha mientras se raspa la web?" ("How to solve CAPTCHAs while scraping the web?"), posted by u/melisaxinyue to r/u_melisaxinyue on 8/11/2021: https://www.reddit.com/r/u_melisaxinyue/comments/p29oud/cómo_resolver_captcha_mientras_se_raspa_la_web/*
---
**Handling AJAX and JavaScript**
Handling AJAX and JavaScript while scraping the web can sometimes be tricky, especially when you're new to the technology.
Lately I've received many questions about how to scrape AJAX and JavaScript. I've collected some of the most frequent ones from customers:
* How do I scrape an infinite-scroll AJAX website?
* How do I scrape data and click the "Load more" or "Next" button?
* How do I scrape websites with AJAX content (like Gumtree)?
* Can Octoparse be used to scrape dynamic content from websites that use AJAX?
* Can I scrape data from a website with pagination?
* Can I scrape websites that load data dynamically (like Facebook)?
* Can I crawl a website that loads content using JavaScript?
......
[*Dealing with infinite scroll / load more*](http://www.octoparse.es/tutorial-7/infinite-scrolling-and-load-more)
[*Dealing with AJAX*](http://www.octoparse.es/tutorial-7/ajax)
[*Incremental extraction: get updated data easily*](http://www.octoparse.es/tutorial-7/obtenga-datos-actualizados-f%C3%A1cilmente)
[*How to handle pagination with page numbers*](http://www.octoparse.es/tutorial-7/paginacion-con-numeros-de-pagina)
[*AJAX auto-detection*](http://www.octoparse.es/tutorial-7/autodeteccion-ajax)
**Scraping Web Pages with AJAX Is Not Easy**
Sometimes people look at a web page, see that its content is loaded via AJAX, and assume the site can't be scraped. If you're learning Python and dipping your toes into building a web scraper, it won't be very easy. If you're looking for a quick and easy way to do this, especially for large workloads, you may want to look at third-party applications for extracting data from AJAX pages.
**Example: Scraping Websites with Infinite Scroll**
So, as an example, what I'm going to show is how to scrape websites with infinite scroll. (If you're an experienced programmer who writes your own amazing web scraping tools, just ignore my rambling.)
[See here how to handle and scrape infinite-scroll websites.](http://www.octoparse.es/tutorial-7/infinite-scrolling-and-load-more)
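If you do want to script it yourself, the trick with many infinite-scroll pages is that the browser is quietly calling a JSON endpoint with a page counter, and you can often call that endpoint directly instead of simulating scrolling. A minimal sketch; the endpoint and its parameters are hypothetical, so find the real ones in your browser's network tab:

```python
import json
import urllib.parse
import urllib.request

# Hypothetical JSON endpoint that the page's own JavaScript calls on each scroll.
API = "https://example.com/api/listings"


def page_url(page, per_page=20):
    """Build the same paged request the browser issues while you scroll."""
    return f"{API}?{urllib.parse.urlencode({'page': page, 'limit': per_page})}"


def fetch_all(max_pages=50):
    """Walk the pages until one comes back empty, i.e. the scroll is exhausted."""
    items = []
    for page in range(1, max_pages + 1):
        with urllib.request.urlopen(page_url(page)) as resp:
            batch = json.load(resp)
        if not batch:
            break
        items.extend(batch)
    return items
```

Not every site exposes such an endpoint; when it doesn't, a tool that drives a real browser (like the one in the tutorial above) is the way to go.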
---
*Source post: "Web Scraping - Scraping de AJAX y JavaScript Websites" ("Web Scraping - Scraping AJAX and JavaScript Websites"), posted by u/melisaxinyue to r/u_melisaxinyue on 5/27/2020: https://www.reddit.com/r/u_melisaxinyue/comments/grg0eq/web_scraping_scraping_de_ajax_y_javascript/*
---
Please see the original article and more posts on the blog: [Web Scraping API](https://www.octoparse.es/blog/web-scraping-api-para-extraccion-de-datos)
## What is an API?
Wikipedia says: "In [computer programming](https://en.wikipedia.org/wiki/Computer_programming), an application programming interface (API) is a set of [subroutine](https://en.wikipedia.org/wiki/Subroutine) definitions, [protocols](https://en.wiktionary.org/wiki/Protocol), and tools for building [application software](https://en.wikipedia.org/wiki/Application_software). In general terms, it is a set of clearly defined methods of communication between various software components."
In general, a web API is a set of rules developers must follow when they interact with a programming language, much as Harry Potter must say "Alohomora" to open a door.
**A common misconception is that an API can extract data.** That's not entirely true: an API is only responsible for fetching data from the resources dedicated to it. In most cases you get only what you request, with no access to other information.
**Web Scraping API**
For example, if you want to run sentiment analysis and need reviews and comments, **a web API** is used to send your request for that keyword to a web server; in return, the server gives you reviews or comments in a raw data format. Raw-format data doesn't necessarily look user-friendly, unlike the rows and columns of a spreadsheet.
As such, to "consume the data" from a product page, we have to follow several steps in an end-to-end process of extraction, transformation, and storage. Sometimes you even have to convert the raw data into the desired format yourself. That may seem an easy job for experienced programmers, but the complexity still frustrates people who lack a programming background yet need more data.
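The transformation step is usually the mechanical part; a sketch that turns a raw JSON review payload (the field names here are hypothetical) into spreadsheet-style CSV rows:

```python
import csv
import io
import json

# Raw payload as a review API might return it (hypothetical field names).
raw = json.loads('[{"user": "ana", "rating": 5, "text": "Great fit"},'
                 ' {"user": "li", "rating": 3, "text": "Runs small"}]')


def to_csv(records):
    """Flatten the raw API records into CSV rows a spreadsheet can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["user", "rating", "text"])
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

This is the "transform" in extract-transform-store; the tools discussed below do the same conversion without any code.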
**Standard API and Advanced API**
To reduce the complexity, it's better to have a web scraping tool with API integration that can extract and transform the data at the same time, without writing any code.
Octoparse is an intuitive web scraping tool designed for non-coders to extract data from any website. Its software engineers built the API integration so you can achieve two things:
**1. Extract website data without having to wait for a web server's response.**
**2. Automatically deliver the extracted data from the cloud to your internal applications through Octoparse's API integration.**
Beyond that flexibility, it lets you convert raw data into forms such as Excel or CSV as needed. Another benefit is that it can run on a schedule, which removes the complexity of manual data extraction.
In case you've never used Octoparse, let me explain in detail how you can use it to extract data and stream it into your database.
Octoparse has two kinds of API. The first is the [Standard API](https://helpcenter.octoparse.es/hc/es/articles/360040826634--Cu%C3%A1l-es-la-diferencia-entre-API-est%C3%A1ndar-y-API-avanzada-). **The Standard API** can do all the jobs mentioned above; you can use it to pull data into a CRM system or a data visualization tool to generate handsome reports.
The second API is called the [**Advanced API**](http://advancedapi.octoparse.com/help). It is a superset of the Standard API: it does everything the Standard API does and, better still, can access and manipulate the data stored in the cloud. As data-driven business models grow more popular, people without coding knowledge are increasingly expected to use different tools to extract data. If using an API has frustrated you too, you'll find great value in [Octoparse](https://www.octoparse.es/product), since its integration process is easy.
---
*Source post: "Web Scraping API: Una Guía para Principiantes" ("Web Scraping API: A Beginner's Guide"), posted by u/melisaxinyue to r/api on 7/24/2020: https://www.reddit.com/r/api/comments/hwyabt/web_scraping_api_una_guía_para_principiantes/*
---
Content is the most basic way to attract traffic: without a certain amount of quality content, neither Google nor visitors will be interested in your website, because there is little value to be gained from browsing it.
Here are two main no-coding solutions for extracting content from websites and building your content base: pick one, or both, and give them a try.
## Table of Contents
[Extract content from websites using a web scraping tool](http://www.octoparse.es/blog/2-formas-de-extraer-contenido-de-sitios-web-sin-codificaci%C3%B3n-para-aumentar-el-tr%C3%A1fico-web#h1)
[Extract content from websites using content aggregation tools](http://www.octoparse.es/blog/2-formas-de-extraer-contenido-de-sitios-web-sin-codificaci%C3%B3n-para-aumentar-el-tr%C3%A1fico-web#h2)
[Conclusion](http://www.octoparse.es/blog/2-formas-de-extraer-contenido-de-sitios-web-sin-codificaci%C3%B3n-para-aumentar-el-tr%C3%A1fico-web#h3)
# Extract Content from Websites Using a Web Scraping Tool
Web scraping is the process of extracting information from a website without using an API to get the content; to avoid unauthorized activity, you must still follow the requirements of the site's robots.txt.
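Python's standard library can perform that robots.txt check for you; a sketch with an inline ruleset (a real crawler would instead load the target site's file via `robots.set_url(...)` and `robots.read()`):

```python
from urllib.robotparser import RobotFileParser

# Parse an inline robots.txt ruleset; a real crawler would fetch the
# target site's /robots.txt instead.
robots = RobotFileParser()
robots.parse([
    "User-agent: *",
    "Disallow: /private/",
])


def allowed(url, agent="my-content-bot"):
    """Ask before fetching: does robots.txt permit this URL for our agent?"""
    return robots.can_fetch(agent, url)
```

Calling `allowed(...)` before every request is a cheap way to keep a content crawler on the right side of a site's stated rules.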
Here are some of the main pros and cons of web scraping.
### Pros:
1. You can scrape trending, well-rated content from multiple platforms with a single web scraping tool. This can save the time and money of dealing with multiple content aggregators.
2. You can collect audience-reaction data such as likes, views, and shares where available. Content and reaction data are valuable for building your content matrix.
3. You can scrape content from your competitors' sites to analyze the competition and their content strategy.
4. You can build a content base with a large pool of resources. Whenever you need inspiration or references, abundant resources are at your fingertips.
### Cons:
1. The extracted data may need additional processing, and you may have to edit the content's formatting manually yourself, which can take a little time.
2. The sites you scrape content from may block your IP, and if they do, you may lose access to them.
3. The tool cannot automate the content distribution process for you the way some content aggregation tools do.
If you're looking for a good web scraping tool, here are three popular ones you shouldn't miss.
## [Octoparse](http://www.octoparse.es/)
Octoparse is [a powerful web scraping tool](http://www.octoparse.es/) for extracting text, videos, and images from any website. It offers free pre-built templates for pulling data from a variety of sites, which means users don't have to configure a crawler themselves to extract information from sites such as Amazon, Booking, and so on. They only need to pick a template and enter keywords or URLs to extract the site's most commonly scraped data fields. If users want to build a custom crawler, that is easy to set up too: simply click on the web page to create one.
It also has many practical features, such as data reformatting, task scheduling, parent-task configuration, cloud-accelerated extraction, and more. It is one of the powerful tools that can help you extract content from websites easily.
## Scraper
Scraper is a Chrome extension with limited data extraction features compared with other software, but it is handy for individual users doing online research. It can export the extracted data directly to Google Spreadsheets.
The tool is designed for web crawling beginners. You can easily copy the data to the clipboard or store it in spreadsheets using OAuth. Automatic XPath generation is one of its great features for beginners, although if you want more precise data, you will inevitably have to rewrite the XPath yourself.
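To see what hand-tuning an XPath involves, here is a minimal sketch using Python's standard library, which supports only a subset of XPath (full engines like lxml or browser extensions accept richer expressions). The HTML snippet is invented for the example.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed snippet standing in for a scraped page (made-up data).
HTML = """
<html><body>
  <div class="result"><a href="/item/1">First</a></div>
  <div class="result"><a href="/item/2">Second</a></div>
</body></html>
"""

root = ET.fromstring(HTML)
# Select every <a> inside a <div class="result">, anywhere in the tree.
links = [a.get("href") for a in root.findall(".//div[@class='result']/a")]
print(links)  # ['/item/1', '/item/2']
```

An auto-generated XPath often anchors on brittle positions (e.g. `div[3]/a[1]`); rewriting it against stable attributes like `class`, as above, is what makes extractions survive page changes.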
## ParseHub
Parsehub is a great web scraper that supports collecting data from websites built with AJAX, JavaScript, and similar technologies, so web-compatibility problems are unlikely when you use it. It also has advanced machine-learning technology that can help you transform web documents into data.
Parsehub supports all popular operating systems, such as Windows, macOS, and Linux, so you don't have to worry about cross-platform use. The free version allows up to five public projects, and the cheapest paid subscription plans let you create at least 20 private projects for scraping websites. It is very convenient for individual users and small businesses.
# Extract Content from Websites Using Content Aggregation Tools
A content aggregation tool is an application or website that helps you collect content from a wide range of platforms and then republish it all in one place. There are many kinds of content aggregation tools, each specializing in collecting different content topics (sports news, financial news, gaming news, etc.) or content formats (video, blogs, podcasts, images, etc.).
There are some important advantages and disadvantages of content aggregation tools you should know before making a decision.
### Pros:
1. Some content aggregation tools can personalize content for you. This generally helps your audience connect better with your site and recognize that it is the right one for them.
2. Some content aggregators are masters of content distribution. They know how to maximize your content's reach to its potential audience, helping you drive more traffic to your sites.
3. You can leave manual content distribution to a content aggregation tool, freeing yourself from tedious manual work and letting you focus on more valuable tasks.
4. One of the best things about content aggregators is that they can help you build backlinks to your site and thereby improve its SEO performance.
### Cons:
1. When your audience reads content aggregated from other sites, they may subscribe to the original sites and leave yours.
2. Using content aggregators on your site can boost the popularity of the original content owners, not yours.
3. Without creating original content, you may lose the chance to understand your audience better and have no direct communication with them, which translates into lost conversion opportunities.
4. A content aggregator's core function is collecting content at scale, so the tool itself cannot filter content or guarantee its reliability. Your site could be hurt by fake news.
## Trapit
Trapit is a comprehensive content aggregation tool for businesses that covers a wide range of content topics. It can pull text and video feeds from a broad set of websites, and it also offers built-in analytics and social scheduling tools. If you want to aggregate industry insights, research, and trends for your audience on your website or social media platforms, it is one of the great tools you shouldn't miss.
## BuzzSumo
BuzzSumo is a powerful online content aggregation tool that keeps you up to date on every trending topic in your industry and lets you find popular content on any website. You can search for the topic you are interested...
(Posted 10/30/2020 by melisaxinyue in u/melisaxinyue: "2 Formas de Extraer Contenido de Sitios Web Sin Codificación para Aumentar el Tráfico Web", https://www.reddit.com/r/u_melisaxinyue/comments/jkq8ja/2_formas_de_extraer_contenido_de_sitios_web_sin/)

Posted 7/22/2020:
Please click through to the original article: Image Downloaders - [Web Spider](http://www.octoparse.es/blog/5-descargadores-de-imagenes-a-granel)
How can you download images from links **in bulk, for free**?
To download an image from a link, you may want to search for "bulk image downloaders". Inspired by the queries we received, I decided to put together a list of the "top 5 bulk image downloaders" for you. Be sure to check out this article if you want to download images from links at no cost.
## 1. [Tab Save](https://chrome.google.com/webstore/detail/tab-save/lkngoeaeclaebmpkgapchgjdbaekacki?hl=en)
https://preview.redd.it/ljshfr27xdc51.png?width=985&format=png&auto=webp&v=enabled&s=a29743ea53dee8e429eb12104aad823fb80ab601
Average Rating: ★★★★
Application Type: Chrome extension
Product Reviews: This is the image downloader I am using. You can use it to save the files displayed in the current window. After extracting all the image URLs, you can paste in the whole URL list to download the files quickly.
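The same two-step flow (collect the image URLs, then download them in bulk) is easy to script as well. A minimal sketch, assuming you already have a URL list; the URLs below are placeholders, and `bulk_download` needs network access to actually run.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve

def filename_from_url(url):
    """Derive a local file name from an image URL (query string dropped)."""
    name = os.path.basename(urlparse(url).path)
    return name or "unnamed"

def bulk_download(urls, dest="images"):
    """Download every URL in the list into dest/ (requires network)."""
    os.makedirs(dest, exist_ok=True)
    for url in urls:
        urlretrieve(url, os.path.join(dest, filename_from_url(url)))

# Example: the extracted URL list you would paste into a downloader.
urls = [
    "https://example.com/photos/cat.png?width=985",
    "https://example.com/photos/dog.jpg",
]
print([filename_from_url(u) for u in urls])  # ['cat.png', 'dog.jpg']
```

Deriving file names yourself also sidesteps the renaming complaints raised about some of the extensions below.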
## 2. Bulk Download Images (ZIG)
https://preview.redd.it/cu9rkdk8xdc51.png?width=983&format=png&auto=webp&v=enabled&s=aa025bd007c6c35a906ea618657552caafecf8e5
Average Rating: ★★★ ½
Application Type: Chrome extension
Product Reviews: You can use it to batch-download full-size images rather than thumbnails, with optional filtering rules. But some users find it overly complex and confusing.
## 3. [Image Downloader](https://chrome.google.com/webstore/detail/image-downloader/cnpniohnfphhjihaiiggeabnkjhpaldj?hl=en-US)
https://preview.redd.it/rge1fv6axdc51.png?width=983&format=png&auto=webp&v=enabled&s=22405e81285d7d940745b3f34cb174f11d8c3ad0
Average Rating: ★★★ ½
Application Type: Chrome extension
Product Reviews: If you need to bulk-download images from a web page, this extension lets you download every image the page contains. Many users find it powerful and easy to use.
## 4. Image Downloader Plus
https://preview.redd.it/zr6gytabxdc51.png?width=981&format=png&auto=webp&v=enabled&s=cbc0def7beca06f315ecbf357ec2504ddaf10b9a
Average Rating: ★★★
Application Type: Chrome extension
Product Reviews: You can use it to download and scrape photos from the web. It lets you download the selected images into a specific folder and upload them to Google Drive. But some users complain that it renames the files and resizes images to an unusable degree.
## 5. [Bulk Image Downloader](https://chrome.google.com/webstore/detail/bulk-image-downloader/facoldpeadablbngjnohbmgaehknhcaj?hl=en-US)
https://preview.redd.it/zqakia9cxdc51.png?width=982&format=png&auto=webp&v=enabled&s=1f00fb730f814794a80c0b8f8fee220659dda352
Average Rating: ★★★
Application Type: Chrome extension
Product Reviews: You can use it to bulk-download images from one or several web pages. It supports bulk downloading across multiple tabs; you can choose all tabs, the current tab, or the tabs to the left or right of the current one.
**We are open to suggestions!**
(Posted 7/22/2020 by melisaxinyue in r/webscraping: "Descargar Imágenes a Granel desde El Enlace - Top 5 Descargadores de Imágenes a Granel", https://www.reddit.com/r/webscraping/comments/hvrccu/descargar_imágenes_a_granel_desde_el_enlace_top_5/)

Posted 11/13/2020:
"[Big Data](https://www.octoparse.es/blog/10-big-data-analytics-cursos-en-linea)" is a term growing ever more popular with the public, as is "[data mining](https://www.octoparse.es/blog/10-habilidades-para-data-mining)", a practical application of Big Data. Although everyone talks about "Big Data" or "data mining", do you really know what they are? Here we briefly present some real-life examples of how Big Data has impacted our lives, through 10 interesting stories.
https://preview.redd.it/56e6wsc1izy51.png?width=644&format=png&auto=webp&v=enabled&s=0d434c4f84a1c51df82cce5e18fa67db8f63f4f0
**1. A classic case: diapers and beer**
Big Data is put to good use supporting decision-making in Walmart's marketing department. Walmart's market researchers found that when male customers visit the baby department to pick out diapers for their little ones, they are very likely to buy a couple of beers too. So Walmart placed beer next to the diapers, which led to a significant rise in sales of both.
https://preview.redd.it/k3n9lxs1izy51.png?width=940&format=png&auto=webp&v=enabled&s=a61d04b7bfc60d777c31657727f85ee17a67f70b
2. **A carmaker improved its vehicle models through social media platforms**
Big Data influenced Ford's vehicles from the very beginning of the car design process.
Ford's R&D team once analyzed the ways people open the rear liftgate of its SUVs (manually versus automatically). Although their routine surveys did not flag this as a potential problem, the Ford team found that people actually talked about it a lot.
https://preview.redd.it/dsut4pc2izy51.png?width=1080&format=png&auto=webp&v=enabled&s=1f4d008f41bd22e5f1ccc795d115ec9284f26753
**3. Using CCTV to switch menus**
One fast-food restaurant was innovative enough to switch between the menus shown on its big screen based on the queue length detected through CCTV. Following a preset algorithm, the CCTV sends queue information to a computer, which does the computation and sends back the result controlling which menu to display. For example, when the line is long, the menu screen offers more fast-food options; when the line is short, it offers dishes with higher margins that may take longer to prepare.
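The switching rule described above fits in a few lines; the threshold and menu names here are invented for illustration.

```python
def choose_menu(queue_length, threshold=10):
    """Pick which menu to display from the detected queue length.

    Long line -> fast items to keep throughput up; short line -> higher-margin
    dishes that take longer to prepare. The threshold is a made-up parameter
    a restaurant would tune from its own data.
    """
    return "fast-food menu" if queue_length >= threshold else "high-margin menu"

print(choose_menu(15))  # fast-food menu
print(choose_menu(3))   # high-margin menu
```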
**4. Google successfully forecast winter flu**
In 2009, Google studied the 50 million most frequently searched terms and compared them against CDC data from the 2003-2008 flu seasons to build a statistical model. The model successfully forecast the spread of winter flu, even down to the state level.
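The core of that comparison is measuring how well a search term's volume tracks reported cases. A toy sketch with fabricated weekly numbers (Google's actual model tested millions of terms against CDC data):

```python
from math import sqrt

# Made-up weekly figures: volume of a flu-related query vs. reported cases.
searches = [120, 150, 200, 260, 310, 280]
cases = [100, 130, 190, 240, 300, 270]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(searches, cases)
print(round(r, 3))  # close to 1.0: this term tracks the outbreak
```

Terms whose correlation stays high across past seasons are the ones worth keeping in the forecasting model.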
**5. Big Data knows your music taste better**
The music you listen to in the car can, to some extent, reflect your musical preferences. Gracenote developed techniques that use the microphones built into smartphones and tablets to recognize the songs playing on a user's TV or stereo, detect reactions such as applause or booing, and even detect whether the user turned up the volume. In this way, Gracenote can learn which songs users like and the specific time and place they are played.
https://preview.redd.it/az02ln23izy51.png?width=830&format=png&auto=webp&v=enabled&s=ddcd99d70c87e23136d2fab3b7e59883dc060e32
**6. Microsoft Big Data successfully predicted 21 Oscar winners**
In 2013, David Rothschild of Microsoft Research New York used Big Data to correctly predict 19 of 24 Oscar winners, and 21 winners the following year.
**7. Using Big Data to forecast crime scenes**
PredPol, working with the Los Angeles and Santa Cruz police and a team of researchers, predicts the probability of a crime occurring based on a variation of earthquake-prediction algorithms plus crime data, and can be accurate to within 500 square feet. In Los Angeles, where the algorithm was applied, burglaries and violent crimes dropped by 33% and 21% respectively.
https://preview.redd.it/cipk6tk3izy51.png?width=660&format=png&auto=webp&v=enabled&s=a2ee418e4c946335b9b505e6ff054182206f1f91
**8. Octoparse used reviews to refine its product**
Octoparse, a SaaS company dedicated to web scraping, always takes customer suggestions seriously. In 2020, Octoparse collected tens of thousands of customer reviews, studied them with NLP, and upgraded the product accordingly; the customer experience improved greatly.
**9. Finding your love through Big Data**
Chris McKinlay was a math Ph.D. student at UCLA. After failing to find the right girl through many blind dates, he decided to use mathematics and data to analyze dating sites. With his talent, McKinlay built a bot that used fake OkCupid accounts to collect a large amount of women's information from the web. He spent three weeks collecting 60,000 question answers from 20,000 women in the United States, then classified the female users into seven groups using an improved K-Modes algorithm he had developed, and used a mathematical model to compute his degree of compatibility with two of those groups. While dating his 88th match, he found his true love.
**10. Alibaba deployed Big Data anti-counterfeiting measures**
Alibaba recently disclosed a series of counterfeiting cases. Ali's security department stated that "the most reliable big data is, in fact, account transaction data, logistics, and shipping information." Alibaba security staff said they can trace offline warehouses through queries on shipping addresses, IP addresses, return addresses, and more. Account transaction data can be surfaced for every transaction and every sales record, so even if sellers use different IDs and stores, counterfeit merchants can be found offline through Big Data. According to Alibaba's public relations department, after years of practice it has established a big-data crackdown model to monitor, analyze, and act against counterfeit product networks, and it is currently working with the police to cut off the circulation of counterfeit goods.
There are so many practical uses of Big Data and data mining in our lives. In short, behind anything that feels like magic, you can count on Big Data. Explore the fun stories of Big Data in your own life; we are happy to talk them over with you.
More related articles:
[The 30 Best Data Visualization Tools in 2020](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos)
[The 30 Best Free Web Scraping Software in 2020](https://www.octoparse.es/blog/30-mejores-software-gratuitos-de-web-scraping)
[Big Data: 70 Amazing Free Data Sources You Should Know for 2020](https://www.octoparse.es/blog/70-fuentes-de-datos-gratuitas-en-2020)
(Posted 11/13/2020 by melisaxinyue in u/melisaxinyue: "Data Mining Explicada con 10 Historias Interesantes", https://www.reddit.com/r/u_melisaxinyue/comments/jternw/data_mining_explicada_con_10_historias/)

Posted 8/25/2021:
These days, people review and compare products and services online before making a purchase. Clearly, **user experience** is crucial for businesses to retain existing customers over time. For first-time buyers, however, price is the deciding factor. That said, **price monitoring** is essential for your business.
**Table of contents**
[What is price monitoring?](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#t1)
[How does price monitoring help a business?](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#t2)
[What are the 10 best price monitoring tools?](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#t3)
* [Mozenda](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a1)
* [Import.io](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a2)
* [Octoparse](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a3)
* [Data Crops](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a4)
* [Prisync ](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a5)
* [Omnia Dynamic Pricing](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a6)
* [Price2Spy](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a7)
* [Skuuudle](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a8)
* [Repricer](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a9)
* [Minderest](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a10)
## What is price monitoring?
Price monitoring, also called [price intelligence](https://es.wikipedia.org/wiki/Price_intelligence) or competitive price monitoring, is the analysis of internal and external price variables (competitors' historical and real-time prices) in order to optimize one's pricing strategy.
## How does price monitoring help a business?
* **For internal analysis:**
Monitoring your own price history can help reflect your market strategy. Together with product turnover and brand value, price monitoring can help you build the best pricing strategy and maximize profits.
* **For** [**market competition analysis:**](https://www.octoparse.es/blog/3-typical-ways-to-use-web-scraping-tools-for-marketing-decision)
Competitive price monitoring lets you gather competitor intelligence, which is essential in a market report. From the information collected, such as the product-to-price ratio and your target group, you will get an idea of your market positioning.
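As a toy example of what that analysis looks like once competitor prices have been scraped (all names and numbers below are invented):

```python
from statistics import mean

# Hypothetical scraped price history: competitor -> recent prices, oldest first.
history = {
    "competitor_a": [19.99, 18.49, 17.99],
    "competitor_b": [21.00, 20.50, 20.75],
}
our_price = 19.50

# Flag competitors whose latest price undercuts ours, alongside their average.
report = {}
for name, prices in history.items():
    latest = prices[-1]
    report[name] = {
        "latest": latest,
        "average": round(mean(prices), 2),
        "undercuts_us": latest < our_price,
    }
print(report)
```

A real pipeline would feed scraped prices in on a schedule and trigger alerts or repricing from the `undercuts_us` flag.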
## What are the 10 best price monitoring tools?
In short, a price monitoring tool is a MUST!
This article presents the best price monitoring tools, and I have also categorized them so you can choose more easily.
[\# 1 Web scraping tool](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#div1)
[\# 2 Price monitoring platform / software](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#div2)
**#1 Web scraping tool**
A web scraping tool is the most cost-effective option for small and medium businesses on a limited budget. Compared with price monitoring software, the advantages of a web scraping tool are:
**Multi-scenario:** Besides price monitoring, a web scraping tool can also be used for lead generation, risk management, academic research, and market analysis.
**Multi-industry:** In addition, a web scraping tool can also be used in many industries, including real estate, the [car industry](https://service.octoparse.com/scrape-property-appraisal-data), hospitality, [consultancy](https://service.octoparse.com/pricetrack-consultancy-web-scraping), and more. A price monitoring software, by contrast, is narrower and can only be used in e-commerce.
Related article: [How to track competitors' prices with web scraping](https://www.octoparse.es/blog/5-razones-por-web-scraping-puede-beneficiar-negocio)
### [Mozenda](https://mozenda.com/)
**Type:** Client | **Price:** from $250 per month | **Free trial:** 30-day free trial
**Features:**
· Auto-fill input boxes
· [Image and file download](http://www.octoparse.es/tutorial-7/extract-data)
· History tracking
· Publishing and exporting
· Error handling
· Scheduling and notifications
· Full-featured API
· Anonymous proxies
**Use case**: [Track your competitors with price monitoring](https://medium.com/@realtoughcandy/greetings-data-scrapers-in-todays-tutorial-i-m-going-to-show-you-how-to-monitor-retail-prices-f2e42558f997)
### [Import.io](https://www.import.io/)
**Type:** Add-on extension | **Price:** Custom ($299 \~ $9999) | **Free trial:** N/A
**Features:**
· API
· Alerts / notifications
· Customizable reports
· Data import / export
· Data visualization
· Reports and statistics
Use case: [How to monitor prices with Import.io](https://www.import.io/post/how-to-create-a-competitor-price-monitoring-strategy/)
### [Octoparse](http://www.octoparse.com/)
**Type:** Client | **Price:** $0 \~ $249 per month | **Free trial:** [14-day free trial](https://www.octoparse.es/pricing)
Use case: Get pricing information with a [web scraping template](http://www.octoparse.es/tutorial-7/empieze-usar-easy-template)
https://preview.redd.it/mjdrvy311hj71.jpg?width=700&format=pjpg&auto=webp&v=enabled&s=438c02853b93376fbc0b000b54bd990f9a812998
With Octoparse's web scraping templates, anyone can capture product prices and other information from Amazon anytime, anywhere. The result below is what you could get with Octoparse.
https://preview.redd.it/uwsbdac31hj71.jpg?width=700&format=pjpg&auto=webp&v=enabled&s=e416fe63f5a4e666e0dbc8eb231c6f84b5fa58b2
**#2 Price monitoring platform / software**
A price monitoring platform or software, as the name suggests, focuses on serving the e-commerce industry by monitoring and tracking prices. That is to say, it is professional pay-per-use software, in contrast to the [\# 1 Web scraping tool](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#div1) category.
### [Data Crops](https://datacrops.com/)
**Introduction:** Established in 2004, Aruhat Technologies is a certified software company in India with the vision of delivering technology for continuous business improvement and innovation backed by core competencies.
**Price:** Custom | **Free trial:** N/A
**√** Data Crops ranges from business intelligence to pricing and repricing tools
**√** User-friendly interface
× Limited data extraction
× Occasionally fails to extract data
**Features:**
· Disparate data collection
· [Image extraction](http://www.octoparse.es/tutorial-7/extract-data)
· Document extraction
· Email alerts
### [Prisync ](https://prisync.com/)
**Introduction:** Prisync is an e...
(Posted 8/25/2021 by melisaxinyue in u/melisaxinyue: "Las 10 Mejores Herramientas de Seguimiento de Precios en 2021", https://www.reddit.com/r/u_melisaxinyue/comments/pb7knz/las_10_mejores_herramientas_de_seguimiento_de/)

Posted 8/9/2021:
SEO (search engine optimization) is the process of affecting the visibility (the ranking) of a website or a web page within the organic results of search engines such as Google. To that end, search engines index the pages on the web and order them according to their algorithms. In fact, this is the free way to improve your Google ranking and attract more traffic.
https://preview.redd.it/r5c1ood799g71.png?width=700&format=png&auto=webp&v=enabled&s=c14a5ae433f14c2de0f7f362a499516f91271767
**Table of contents**
[Improving SEO & Google ranking](https://www.octoparse.es/blog/una-forma-facil-y-gratuita-para-mejorar-tu-ranking-de-google#h1)
[Keyword research](https://www.octoparse.es/blog/una-forma-facil-y-gratuita-para-mejorar-tu-ranking-de-google#h2)
[Backlink research](https://www.octoparse.es/blog/una-forma-facil-y-gratuita-para-mejorar-tu-ranking-de-google#h3)
## Improving SEO & Google ranking
A study by Infront Webworks showed that the first page of Google receives 95% of web traffic, with subsequent pages receiving 5% or less of the total. So for most people, especially those who want to start a business on limited funds, SEO (search engine optimization) is a good way to improve their Google ranking, showcase their websites, and attract more visitors at a relatively low cost.
However, SEO is a big subject with many factors that affect Google ranking, such as:
* **On-page factors:** keyword in the title tag, keyword in the H1 tag, description, content length, etc.
* **Site factors:** sitemap, domain trust, server location, etc.
* **Off-page factors:** the number of linking domains, the domain authority of the linking page, the authority of the linking domain, etc.
* **Domain factors:** length of domain registration, domain history, etc.
(Note: for more details, see [The 30 Most Important Google Ranking Factors a Beginner Should Know](https://unamo.com/blog/seo/30-important-google-ranking-factors-beginner-know))
**Most of these factors can be researched with web scraping tools for free** (see [The 30 Best Free Web Scraping Software in 2021](https://www.octoparse.es/blog/30-mejores-software-gratuitos-de-web-scraping) for more information). And with enough information, you can develop a better strategy to improve your Google ranking.
So in this post I will focus only on keyword and backlink research, to show you how to identify projected traffic and, ultimately, how to determine the value of a ranking in a free and easy way if you have no idea where to start.
## Investigación de palabras clave
Apuesto a que dirías: "Ah, es fácil. Ya sabes, hay muchas herramientas de búsqueda de palabras clave, como [Keyword Planner](https://www.google.com.hk/intl/en/adwords/?channel=ha-ef&sourceid=awo&subid=hk-en-ha-rhef-skhp0~200331725223&gclid=Cj0KEQjwz_TMBRD0jY-RusGilOYBEiQAN-TuFKvM3258DNURsErkKrAwxzhqdW7kGeSDwae1nDWiZJwaAsHq8P8HAQ&dclid=CI_As8H67tUCFcWlvQodl8wADQ), [Buzzsumo](https://app.buzzsumo.com/research/most-shared), por ejemplo. Todos ellos podrían ayudarme a encontrar las palabras clave más valiosas para orientarme con SEO ".
Sí, está correcto. Pero, ¿cómo podrías juzgar el valor de las palabras clave? ¿Cómo sabes que recibes el tipo de visitantes adecuado?
La respuesta es investigar la demanda de palabras clave de tu mercado, predecir cambios en la demanda y producir contenido que los buscadores web estén buscando activamente. Las herramientas mencionadas anteriormente solo nos mostrarían las palabras clave que los visitantes suelen escribir en los motores de búsqueda. Sin embargo, no pueden mostrarnos directamente lo valioso que es recibir tráfico de esas búsquedas. Para comprender el valor de una palabra clave, debemos comprender nuestros propios sitios web, formular algunas hipótesis, probar y repetir—la fórmula clásica de marketing web. Aquí te mostraría cómo funciona.
Por ejemplo, suponga que has elegido algunas palabras clave específicas y has producido algunos contenidos antes, ahora necesitas medir los efectos. Es decir, cuando los buscadores utilizan estas palabras clave relevantes al realizar búsquedas en Google, encontrarán tu sitio web y accederán a él. Es por eso que primero necesitas conocer tu ranking. Tomaré [Octoparse](https://www.octoparse.es/) por ejemplo, para ilustrar eso.
¿Cómo puedo saber el ranking del dominio Octoparse cuando busco las dos palabras clave relevantes "herramienta gratuita de raspado web" y "servicio gratuito de raspado web"? ¿Y cómo podría conocer el rango de otra información detallada antes de Octoparse para poder conocer mejor el valor de las palabras clave buscadas?
La respuesta es la herramienta de raspado web. Con [la herramienta de raspado web Octoparse](https://www.octoparse.es/), puedes raspar fácilmente la información que deseas buscando las palabras clave (consulta [¿Cómo capturar los términos de búsqueda ingresados y el resultado?](https://www.octoparse.es/tutorial-7/ingresar-una-lista-de-palabras-clave-y-scrape-resultados) Para obtener más detalles).
A continuación se muestra el resultado que obtuve con [Servicio de Nube de Octoparse](https://www.octoparse.es/tutorial-7/que-es-la-extraccion-de-nubes).
https://preview.redd.it/dlv2vfxe99g71.png?width=1238&format=png&auto=webp&v=enabled&s=07d9ebb089ef18e27333cf2dd22a18a3ce941aef
I exported the extracted data to Excel and analyzed it. Unfortunately, I did not find the Octoparse domain in the spreadsheet, even though the analytics in [Google Search Console](https://www.google.com/webmasters/tools/) show that most visitors reach my website by searching for these two keywords. This is the problem most people run into, yet they often overlook it without realizing. So if you want to improve your Google ranking, you need to check your ranking frequently and adjust your strategy accordingly.
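That rank check can be automated once the search results are exported. A minimal sketch, assuming the exported spreadsheet has `keyword`, `position`, and `url` columns; the column names and sample rows here are invented for illustration:

```python
import csv
from io import StringIO
from urllib.parse import urlparse

def rank_of_domain(rows, domain):
    """Return {keyword: position} for the first scraped result whose
    URL belongs to `domain`; keywords without a hit are omitted."""
    ranks = {}
    for row in rows:
        host = urlparse(row["url"]).netloc
        if domain in host and row["keyword"] not in ranks:
            ranks[row["keyword"]] = int(row["position"])
    return ranks

# Tiny inline sample standing in for the exported spreadsheet.
sample = StringIO(
    "keyword,position,url\n"
    "free web scraping tool,1,https://example.com/a\n"
    "free web scraping tool,2,https://www.octoparse.es/\n"
    "free web scraping service,1,https://other.org/b\n"
)
rows = list(csv.DictReader(sample))
print(rank_of_domain(rows, "octoparse.es"))  # prints {'free web scraping tool': 2}
```

A keyword missing from the result (like the second one above) is exactly the "my domain is not in the spreadsheet" situation described in the text.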
In my case, for example, I need to check the domains of the sites ranking above me and try to figure out whether their Page Rank is higher than mine. If it is, could my content be of higher quality? If not, what other factors could be optimized to improve the ranking?
This simple example shows that using a web scraping tool for SEO can give you valuable insight into how hard it would be to rank for a given keyword, and into the competition as well.
## Backlink Research
Imagine Google as the Internet's polling station, counting the votes cast by every link it finds on the web. Unlike your typical democracy, where one person gets one vote, Google gives more weight to votes from relevant, authoritative websites. As a result, the most important factor in determining Google rankings tends to come down to those little blue links you see on almost every website.
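The weighted-voting idea above is, loosely, what the PageRank algorithm formalizes: a page's score depends on the scores of the pages linking to it. Here is a toy power-iteration sketch on an invented three-page graph; real rankings involve far more signals than this:

```python
def pagerank(links, damping=0.85, iters=50):
    """links: {page: [pages it links to]}. Returns {page: score}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            if not outs:  # dangling page: spread its vote evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
            else:  # split this page's vote among its outgoing links
                for q in outs:
                    new[q] += damping * rank[p] / len(outs)
        rank = new
    return rank

# Page C receives links from both A and B, so it collects the most "votes".
graph = {"A": ["C"], "B": ["C"], "C": ["A"]}
scores = pagerank(graph)
print(max(scores, key=scores.get))  # prints C
```

Note how B, which nobody links to, ends up with the lowest score even though it casts a vote of its own.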
So how do you get those blue links? The most common way is to look up your competitors' backlinks with an SEO tool such as [Open Site Explorer](https://moz.com/researchtools/ose/). See the Octoparse backlinks I found in Open Site Explorer below.
https://preview.redd.it/vwvori5g99g71.png?width=1085&format=png&auto=webp&v=enabled&s=d15f4b49a32a1f6f2946d814f13cc699214fe68c
But the problem is how to get this information without upgrading my account to a premium one. The answer is to use a tool...
*"An Easy and Free Way to Improve Your Google Ranking", posted by u/melisaxinyue on 8/9/2021: https://www.reddit.com/r/u_melisaxinyue/comments/p0truj/una_forma_fácil_y_gratuita_para_mejorar_tu/*

*The following post was published on 9/8/2021.*
We are flooded with data, and it is hard to present the meaning behind it. This is where data visualization tools come into play. So here are 9 useful data visualization tools to help you make sense of your data. I hope this article serves you well!
**Table of Contents**
Datawrapper
Tableau
Chart.js
Raw
Infogram
Timeline JS
Plotly
DataHero
Visualize Free
## [Datawrapper](http://www.datawrapper.de/)
https://preview.redd.it/x41os54zn8m71.png?width=500&format=png&auto=webp&v=enabled&s=ddee1d7fe917ff05e619eeaad5b257951ba66eb8
Datawrapper is an online data visualization tool for creating interactive charts. Once you upload data from a CSV file or paste it directly into the field, Datawrapper generates a bar chart, line chart, or any other related visualization. Many reporters and news organizations use Datawrapper to embed live charts into their articles. It is very easy to use and produces effective graphics.
**Pros**
* Designed specifically for newsroom data visualization
* The free plan is a good option for smaller sites
* The tool includes a built-in color-blindness checker
**Cons**
* Limited data sources
* Paid plans are expensive
## [Tableau](https://www.tableau.com/)
https://preview.redd.it/p4czm301o8m71.png?width=190&format=png&auto=webp&v=enabled&s=481c97ef52a2587cd0de6b43ba6b0bf515ad0f6c
Tableau Public is perhaps the most popular visualization tool, supporting a wide variety of charts, graphs, maps, and other graphics. It is completely free, and the charts you create with it can easily be embedded in any web page. They have a nice gallery showcasing visualizations created with Tableau.
Although it offers charts and graphics far better than other similar tools, I don't "love" using its free version because of the large footer it comes with. If that doesn't bother you as much as it bothers me, you should definitely give it a try. Or, if you can afford it, you can opt for a paid version.
**Pros**
* Hundreds of data import options
* Mapping capability
* Free public version available
* Many tutorial videos to walk you through using Tableau
**Cons**
* The non-free versions are expensive ($70/month per user for Tableau Creator)
* The public version does not let you keep your data analyses private
## [Chart.js](http://www.chartjs.org/)
https://preview.redd.it/dnblzv03o8m71.jpg?width=398&format=pjpg&auto=webp&v=enabled&s=66c39f786c153e9d96d05ad67ba56a9341df4ce6
Chart.js is a perfect fit for smaller projects. Although it offers only six chart types, the open-source Chart.js library is an ideal data visualization tool for hobby and small projects. Rendering charts with HTML5 canvas elements, Chart.js produces flat, responsive designs and is quickly becoming one of the most popular open-source charting libraries.
**Pros**
* Free and open source
* Responsive, cross-browser output
**Cons**
* Very limited chart types compared to other tools
* Limited support outside the official documentation
## [RAWGraphs](http://rawgraphs.io/)
https://preview.redd.it/n8rj7834o8m71.png?width=400&format=png&auto=webp&v=enabled&s=f0adad20691a0880a32b3874d07dcfc2fc14dfe8
Raw describes itself as "the missing link between spreadsheets and vector graphics". It is built on D3.js and is extremely well designed. Its interface is so intuitive that you will feel as if you have used it before. It is open source and requires no registration.
It has a library of 21 chart types to choose from, and all processing happens in the browser, so your data stays safe. RAW is highly customizable and extensible, and can even accept new custom layouts.
**Pros**
* Free and open source
* Intuitive and efficient
* Has help documentation
**Cons**
* Not many adjustable options
## [Infogram](https://infogram.com/)
https://preview.redd.it/8nblz2a5o8m71.png?width=267&format=png&auto=webp&v=enabled&s=fa5a4aa43ff593045b2900e56aef997f1924e186
Infogram lets you create charts and infographics online. It has a restricted free version and two paid options that include features such as 200+ maps, private sharing, an icon library, and more.
It comes with an easy-to-use interface, and its basic charts are well designed. One feature I didn't like is the huge logo that appears when you try to embed interactive charts into your web page (on the free version). It would be better if they made it like the small text Datawrapper uses.
**Pros**
* Tiered pricing, including a free plan with basic features
* Includes 35+ chart types and 550+ map types
* Drag-and-drop editor
* API for importing additional data sources
**Cons**
* Significantly fewer built-in data sources than other apps
## [Timeline JS](https://timeline.knightlab.com/)
https://preview.redd.it/m61fz836o8m71.png?width=243&format=png&auto=webp&v=enabled&s=a30942a9d1f09ec82a953643377d1e3cac79fa98
As the name suggests, Timeline JS helps you create beautiful timelines without writing any code. It is a free, open-source tool used by some very popular websites, such as Time and Radiolab.
Creating your timeline is a very easy-to-follow four-step process, explained on their site. The best part? It can pull media from a variety of sources and has built-in support for Twitter, Flickr, Google Maps, YouTube, Vimeo, Vine, Dailymotion, Wikipedia, SoundCloud, and other similar sites.
**Pros**
* Building an illustrated story with TimelineJS is not complex, and it can sometimes give you a great result
**Cons**
* Not flexible, and it doesn't leave much room for creativity
* It is also hard to fit it nicely into your own website
## [Plotly](https://plot.ly/)
https://preview.redd.it/qej7fkk6o8m71.png?width=400&format=png&auto=webp&v=enabled&s=d2adf733baa18f1e8359bfc8be51b7f94de5c59b
Plotly is a web-based data analysis and charting tool. It supports a good collection of chart types with built-in social sharing features. The available chart and graph types look professional. Creating a chart is just a matter of uploading your data and customizing the layout, axes, notes, and legend. If you're looking to get started, their gallery is a good source of inspiration.
**Pros**
* Creates beautiful, interactive, exportable figures with just a few lines of code
* Much more interactive and visually flexible than Matplotlib or Seaborn
**Cons**
* Confusing initial setup for using Plotly without an online account
* A lot of code to write
## [DataHero](https://datahero.com/)
https://preview.redd.it/9mt5qt97o8m71.jpg?width=200&format=pjpg&auto=webp&v=enabled&s=c4f756369bc7627dccbaafdb1f1324d6834e9b42
DataHero lets you pull together data from cloud services and create charts and dashboards. No technical skills are required, so it's a great tool for your whole team to use.
**Pros**
* Can connect to other platforms and keep that data refreshed daily
* Simple user interface, many options, and integrations with other apps
* Export functionality and speed
**Cons**
* Presenting hard data in an elegant, simple way is not something everyone finds easy
* They need better guidance on how to produce above-average charts
## [Visua...
*"The 9 Best Data Visualization Tools for Non-Developers", posted by u/melisaxinyue on 9/8/2021: https://www.reddit.com/r/u_melisaxinyue/comments/pk6d11/las_9_mejores_herramientas_de_visualización_de/*

*The following post was published on 11/13/2020.*
How can you download images from links **in bulk, for free**?
To download images from links, you may want to look into "bulk image downloaders". Inspired by the queries we have received, I decided to put together a list of the 5 best bulk image downloaders for you. Be sure to check out this article if you want to download images from links at no cost. (If you're not sure how to extract the image URLs in the first place, see this: [**How to Build an Image Crawler Without Coding**](https://www.octoparse.com/blog/build-an-image-crawler-without-coding))
## 1. [Tab Save](https://chrome.google.com/webstore/detail/tab-save/lkngoeaeclaebmpkgapchgjdbaekacki?hl=en)
https://preview.redd.it/c7krrqdnbzy51.png?width=985&format=png&auto=webp&v=enabled&s=9e596726670c1abc420c96439c18920bf0f93b5f
Average rating: ★★★★
App type: Chrome extension
Product review: This is the image downloader I'm currently using. You can use it to save the files displayed in the current window. After extracting all the image URLs, you can paste in the whole list of URLs to download the files quickly.
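The paste-a-list-of-URLs workflow these extensions offer can also be sketched in a few lines of Python. The URL list and output folder below are placeholders, and calling `download_all` would actually hit the network:

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve

def filename_from_url(url, fallback="image"):
    """Derive a local file name from the last path segment of a URL."""
    name = os.path.basename(urlparse(url).path)
    return name or fallback

def download_all(urls, out_dir="images"):
    """Fetch every URL in the list into out_dir (no retries, no dedup)."""
    os.makedirs(out_dir, exist_ok=True)
    for url in urls:
        urlretrieve(url, os.path.join(out_dir, filename_from_url(url)))

# Placeholder list; download_all(urls) would fetch these over the network.
urls = [
    "https://example.com/photos/cat.png",
    "https://example.com/photos/dog.jpg",
]
print(filename_from_url(urls[0]))  # prints cat.png
```

A dedicated extension adds niceties such as retries, filters, and progress bars, but the core loop is this simple.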
## 2. Bulk Download Images (ZIG)
https://preview.redd.it/09nmxrjpbzy51.png?width=983&format=png&auto=webp&v=enabled&s=fd81d768977834b5b06deb2a27722a9ccc9cc52b
Average rating: ★★★ ½
App type: Chrome extension
Product review: You can use it to batch-download full-size images instead of thumbnails, with optional rules. However, some users find it too complex and confusing.
## 3. [Image Downloader](https://chrome.google.com/webstore/detail/image-downloader/cnpniohnfphhjihaiiggeabnkjhpaldj?hl=en-US)
https://preview.redd.it/6fb3fr6qbzy51.png?width=983&format=png&auto=webp&v=enabled&s=5d48ed7e526565f65c023f6df4725654170d6f08
Average rating: ★★★ ½
App type: Chrome extension
Product review: If you need to bulk-download images from a web page, this extension can download all the images the page contains. Many users find it powerful and easy to use.
## 4. Image Downloader Plus
https://preview.redd.it/bsmf1jwqbzy51.png?width=981&format=png&auto=webp&v=enabled&s=79ed1b16c57ec73b3bb7142333e565ffa8ada6bd
Average rating: ★★★
App type: Chrome extension
Product review: You can use it to download and scrape photos from the web. It lets you download selected images into a specific folder and upload them to Google Drive. However, some users complain that it renames files and resizes images to an unusable level.
## 5. [Bulk Image Downloader](https://chrome.google.com/webstore/detail/bulk-image-downloader/facoldpeadablbngjnohbmgaehknhcaj?hl=en-US)
https://preview.redd.it/oqm33xrrbzy51.png?width=982&format=png&auto=webp&v=enabled&s=00865f485988b04ca692c47d370cc83feb4f5596
Average rating: ★★★
App type: Chrome extension
Product review: You can use it to bulk-download images from one or several web pages. It supports bulk image downloads across multiple tabs; you can choose all tabs, the current tab, the tabs to the left of the current one, or the tabs to the right.
**We are open to suggestions!**
If you have any suggestions, email us at support@octoparse.com
*"Bulk Download Images from Links: The 5 Best Downloaders", posted by u/melisaxinyue on 11/13/2020: https://www.reddit.com/r/u_melisaxinyue/comments/jteew1/descargar_imágenes_a_granel_desde_el_enlace_5/*

*The following post was published on 10/30/2020.*
**Table of Contents**
[3 Practical Uses of E-commerce Data](http://www.octoparse.es/blog/3-usos-pr%C3%A1cticos-de-herramientas-de-web-scraping-de-datos-de-comercio-electr%C3%B3nico#h1)
[3 Popular E-commerce Data Scraping Tools](http://www.octoparse.es/blog/3-usos-pr%C3%A1cticos-de-herramientas-de-web-scraping-de-datos-de-comercio-electr%C3%B3nico#h2)
[Conclusion](http://www.octoparse.es/blog/3-usos-pr%C3%A1cticos-de-herramientas-de-web-scraping-de-datos-de-comercio-electr%C3%B3nico#h3)
In today's e-commerce world, data extraction tools are gaining popularity worldwide as competition among e-commerce business owners grows fiercer every year. Data extraction tools have become the new technique of choice for improving performance.
Many store owners use an e-commerce data extraction tool to monitor competitor activity and customer behavior, which helps them stay competitive and improve sales. If you have no idea how to make full use of e-commerce data extraction tools, stick with me: we will go through the 3 most practical uses of an extraction tool and how it helps grow your business.
## Three Practical Uses of E-commerce Data
## 1) [Price Monitoring](http://www.octoparse.es/blog/top-10-price-monitoring-tool)
Price is one of the most critical factors affecting customers' willingness to buy. 87% of online shoppers say price is the most important factor influencing purchasing behavior, followed by shipping cost and speed. That research suggests a potential customer will not hesitate to leave your store if your price does not match their expectations.
Moreover, according to an AYTM study, 78 percent of shoppers compare prices across two or more brands and then opt for the lowest. With easy access to many free online price-comparison tools, shoppers can quickly check the price of a specific item across dozens of brands and marketplaces.
Online business owners therefore need an e-commerce data extraction tool to pull pricing information from competitors' web pages or from price-comparison apps. Otherwise, you are likely to struggle to attract new customers or retain your current customer base, because you will not know when or how to adjust your prices for those price-sensitive customers.
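Once competitor prices have been scraped, the "when and how to adjust" decision can start from a simple comparison. A minimal sketch with invented products and a made-up 5% tolerance threshold:

```python
def price_alerts(ours, competitors, tolerance=0.05):
    """Flag products where our price exceeds the cheapest competitor
    by more than `tolerance` (a fraction, e.g. 0.05 = 5%)."""
    alerts = []
    for product, our_price in ours.items():
        rivals = competitors.get(product)
        if not rivals:
            continue  # no scraped data for this product
        cheapest = min(rivals)
        if our_price > cheapest * (1 + tolerance):
            alerts.append((product, our_price, cheapest))
    return alerts

ours = {"usb-cable": 9.99, "charger": 24.99}
competitors = {"usb-cable": [8.49, 9.10], "charger": [24.50, 26.00]}
print(price_alerts(ours, competitors))  # prints [('usb-cable', 9.99, 8.49)]
```

Here the charger stays within tolerance of the cheapest rival, so only the cable is flagged for review.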
## 2) [Competitor Analysis](http://www.octoparse.es/blog/competitor-monitoring-for-price-strategy-and-product-planning)
We know that improving the shipping service is another way to boost sales. 56% of online sellers offer free shipping (and easy returns) regardless of purchase price or product type.
Many online sellers use free shipping as a marketing strategy to encourage people to buy from them, or even to buy more. For example, customers are often more willing to spend $100 on a product with free shipping than to buy a $90 product with $10 shipping, and it is common for customers to add items to their order to qualify for a free-shipping offer.
You can use an e-commerce data extraction tool to find out how many of your competitors offer free shipping, extracting and collecting that data in real time. If they do not offer free shipping, you can win over their customers by offering it yourself.
## 3) Customer [Sentiment Analysis](http://www.octoparse.es/blog/text-mining-and-sentiment-analysis-using-python)
Knowing how your competitors' audiences feel about their products or brands can help you evaluate their marketing strategy and customer experience management. E-commerce data extraction tools can help you collect that information.
The customer voices you gather from your competitors will help you understand what customers value and how you can serve them better. Those voices are mostly scattered across comments and conversations in your competitors' stores, and across the posts and interactions on their social media. With that information at hand, you will know what customers want from the product and what they do or do not like.
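As a taste of what can be done with scraped reviews, here is a toy lexicon-based sentiment scorer. The word lists are invented for the example; real analyses would use full lexicons or trained models:

```python
POSITIVE = {"great", "love", "easy", "fast", "excellent"}
NEGATIVE = {"slow", "broken", "hate", "expensive", "bad"}

def sentiment(review):
    """Score a review as positive (>0), negative (<0), or neutral (0)
    by counting lexicon hits among its words."""
    words = review.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

reviews = [
    "Great product, easy to set up",
    "Shipping was slow and the box arrived broken",
]
print([sentiment(r) for r in reviews])  # prints [2, -2]
```

Aggregating these scores per competitor or per product feature is what turns scattered comments into the kind of signal the paragraph above describes.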
To outperform your competitors, you need to gather all that information, study it, and draw conclusions, so you can adjust your marketing strategy or your products/services accordingly.
*Now you may be wondering which scraping tools can be used for these purposes. Here, I would like to share this short list of the most popular e-commerce data extraction tools. Do give them a try!*
## 3 E-commerce Data Scraping Tools
## 1) [Octoparse](http://www.octoparse.es/)
[Octoparse](http://www.octoparse.es/) is a powerful, free e-commerce data extraction tool with an easy-to-use point-and-click interface. Both Windows and Mac users will find it easy to extract almost any kind of data they need from a website. With its new auto-detection algorithm, users with or without coding skills can extract large amounts of data within seconds.
**Pros:** Octoparse provides more than 50 [pre-built templates](http://www.octoparse.es/blog/big-announcement-web-scraping-template-take-away) for all users, covering major websites such as Amazon, Facebook, Twitter, Instagram, Walmart, etc. All you need to do is enter the keywords and the URL, then wait for the data. It also offers a free version for everyone, while premium users get features such as crawler scheduling and [cloud extraction](https://helpcenter.octoparse.com/hc/en-us/articles/360018047092-What-is-Cloud-Extraction-) to make the process less time-consuming.
**Cons:** Octoparse cannot extract data from PDF files, and it cannot download files automatically, although it does let you [extract the URLs of images](https://helpcenter.octoparse.com/hc/en-us/articles/360018047452-Can-Octoparse-extract-images-videos-files-), PDFs, and other file types. You can then use a bulk-download tool to [download those files in bulk](https://helpcenter.octoparse.com/hc/en-us/articles/360018324071-How-to-download-images-from-a-list-of-URLs-) from the URLs Octoparse extracts.
## 2) [Parsehub](https://www.parsehub.com/)
ParseHub works with single-page apps, multi-page apps, and other modern web technologies. ParseHub can handle JavaScript, AJAX, cookies, sessions, and redirects. It can easily fill in forms, [loop through dropdowns](https://helpcenter.octoparse.com/hc/en-us/articles/360018281571-How-to-click-through-options-in-a-drop-down-menu-), [log in to websites](https://helpcenter.octoparse.com/hc/en-us/articles/360018008832-Text-keyword-input), click on interactive maps, and deal with websites that use [infinite-scrolling techniques](https://helpcenter.octoparse.com/hc/en-us/articles/360018281551-Dealing-with-Infinitive-Scrolling-Load-More).
**Pros:** Parsehub is compatible with both Windows and Mac OS. It provides a free version for people with e-commerce data extraction...
*"The 3 Most Practical Uses of an E-commerce Data Web Scraping Tool", posted by u/melisaxinyue on 10/30/2020: https://www.reddit.com/r/u_melisaxinyue/comments/jkq94l/los_3_usos_más_prácticos_de_herramienta_de_web/*

*The following post was published on 8/25/2021.*
Nowadays, people review and compare products and services online before making a purchase. Obviously, **user experience** is crucial for businesses to retain existing customers over time. However, price is the deciding factor, especially for first-time buyers. That said, **price tracking** is essential for your business.
**Table of Contents**
[What is price tracking?](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#t1)
[How does price tracking help a business?](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#t2)
[What are the 10 best price tracking tools?](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#t3)
* [Mozenda](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a1)
* [Import.io](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a2)
* [Octoparse](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a3)
* [Data Crops](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a4)
* [Prisync ](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a5)
* [Omnia Dynamic Pricing](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a6)
* [Price2Spy](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a7)
* [Skuuudle](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a8)
* [Repricer](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a9)
* [Minderest](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#a10)
## What Is Price Tracking?
Price tracking, also called [price intelligence](https://es.wikipedia.org/wiki/Price_intelligence) or competitive price monitoring, is the analysis of internal and external price variables (competitors' historical and real-time prices) in order to optimize your own pricing strategy.
## How Does Price Tracking Help a Business?
* **For internal analysis:**
Monitoring your own price history helps reflect your market strategy. Together with product turnover and brand value, price tracking can help you build the best pricing strategy and maximize profits.
* **For** [**market competition analysis:**](https://www.octoparse.es/blog/3-typical-ways-to-use-web-scraping-tools-for-marketing-decision)
Competitive price tracking lets you gather competitor intelligence, which is essential for a market report. From the information collected, such as the product-price ratio and your target group, you will get a sense of your positioning in the market.
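The internal-analysis side can be as simple as keeping dated price snapshots and computing the change between them. A minimal sketch with invented dates and prices:

```python
from datetime import date

def percent_change(history):
    """history: list of (date, price) snapshots for one product,
    oldest first. Returns the % change from first to last snapshot."""
    first, last = history[0][1], history[-1][1]
    return round((last - first) / first * 100, 1)

# Invented snapshots: a product discounted twice over one month.
history = [
    (date(2021, 8, 1), 49.99),
    (date(2021, 8, 15), 44.99),
    (date(2021, 8, 25), 39.99),
]
print(percent_change(history))  # prints -20.0
```

Running this per product over scraped snapshots is the raw material for the historical side of a pricing strategy.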
## What Are the 10 Best Price Tracking Tools?
In short, a price tracking tool is a MUST!
This article introduces the best price tracking tools, and I have also categorized them so you can choose more easily.
[#1 Web scraping tools](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#div1)
[#2 Price tracking platforms/software](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#div2)
**#1 Web scraping tools**
A web scraping tool is the most cost-effective option for small and medium-sized businesses on a limited budget. Compared with dedicated price tracking software, the advantages of a web scraping tool are:
**Multi-scenario:** Besides price tracking, a web scraping tool can also be used for lead generation, risk management, academic research, and market analysis.
**Multi-industry:** In addition, a web scraping tool can be used across many industries, including real estate, the [car industry](https://service.octoparse.com/scrape-property-appraisal-data), hospitality, [consultancy](https://service.octoparse.com/pricetrack-consultancy-web-scraping), and more. By contrast, price monitoring software is more one-sided and can only be used in e-commerce.
Related article: [How to Track Competitor Prices with Web Scraping](https://www.octoparse.es/blog/5-razones-por-web-scraping-puede-beneficiar-negocio)
### [Mozenda](https://mozenda.com/)
**Type:** Client | **Price:** from $250 per month | **Free trial:** 30-day free trial
**Features:**
· Automatically fills in input boxes
· [Image and file downloads](http://www.octoparse.es/tutorial-7/extract-data)
· Tracking history
· Publishing and exporting
· Error handling
· Scheduling and notifications
· Full-featured API
· Anonymous proxies
**Use case**: [Track your competitors with price monitoring](https://medium.com/@realtoughcandy/greetings-data-scrapers-in-todays-tutorial-i-m-going-to-show-you-how-to-monitor-retail-prices-f2e42558f997)
### [Import.io](https://www.import.io/)
**Type:** Add-on extension | **Price:** Custom ($299 \~ $9999) | **Free trial:** N/A
**Features:**
· API
· Alerts/notifications
· Customizable reports
· Data import/export
· Data visualization
· Reports and statistics
Use case: [How to monitor prices with Import.io](https://www.import.io/post/how-to-create-a-competitor-price-monitoring-strategy/)
### [Octoparse](http://www.octoparse.com/)
**Type:** Client | **Price:** $0 \~ $249 per month | **Free trial:** [14-day free trial](https://www.octoparse.es/pricing)
Use case: Get pricing information with a [web scraping template](http://www.octoparse.es/tutorial-7/empieze-usar-easy-template)
https://preview.redd.it/mjdrvy311hj71.jpg?width=700&format=pjpg&auto=webp&v=enabled&s=438c02853b93376fbc0b000b54bd990f9a812998
With Octoparse's web scraping templates, anyone can capture product prices and other information from Amazon anytime, anywhere. The result below is the kind of output you can get with Octoparse.
https://preview.redd.it/uwsbdac31hj71.jpg?width=700&format=pjpg&auto=webp&v=enabled&s=e416fe63f5a4e666e0dbc8eb231c6f84b5fa58b2
**#2 Price tracking platforms/software**
A price tracking platform or software, as the name implies, focuses on serving the e-commerce industry by monitoring and tracking prices. That is to say, it is professional pay-per-use software, in contrast to the [#1 web scraping tools](https://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios#div1) category.
### [Data Crops](https://datacrops.com/)
**Introduction:** Established in 2004, Aruhat Technologies is a certified software company in India with a vision of delivering technology for continuous business improvement and innovation, backed by core competencies.
**Price:** Custom | **Free trial:** N/A
**√** Data Crops ranges from business intelligence to pricing and repricing tools
**√** User-friendly interface
× Limited data extraction
× Occasionally fails to extract data
**Features:**
· Disparate data collection
· [Image extraction](http://www.octoparse.es/tutorial-7/extract-data)
· Document extraction
· Email alerts
### [Prisync](https://prisync.com/)
**Introduction:** Prisync is an e...
*"The 10 Best Price Tracking Tools in 2021", posted by u/melisaxinyue on 8/25/2021: https://www.reddit.com/r/u_melisaxinyue/comments/pb7knz/las_10_mejores_herramientas_de_seguimiento_de/*

*The following post was published on 9/6/2021.*
**Table of Contents**
1. [Why Use a Proxy Server for Web Scraping?](https://www.octoparse.es/blog/utilizar-el-servidor-proxy-para-web-scraping#div1)
2. [Proxy Reliability](https://www.octoparse.es/blog/utilizar-el-servidor-proxy-para-web-scraping#div2)
3. [Web Scraping in the Cloud](https://www.octoparse.es/blog/utilizar-el-servidor-proxy-para-web-scraping#div3)
4. [Popular Web Scrapers for Avoiding IP Blocks](https://www.octoparse.es/blog/utilizar-el-servidor-proxy-para-web-scraping#div4)
* Octoparse
* Import.io
* Webhose.io
* Screen Scraper
https://preview.redd.it/v57kcpwr1ul71.png?width=1600&format=png&auto=webp&v=enabled&s=f5e00de70e67d15e856daf21c7960f00a1f5b959
## Why Use a Proxy Server for Web Scraping?
Web scrapers, or spiders, are becoming increasingly popular in data science. This automated technique helps us retrieve large amounts of customized data from the web or from databases. The main problem, however, is that a website can easily detect too many page requests arriving from a single IP address within too short a time, and the target site may block you for it. To limit the chances of being blocked, we should avoid scraping a website from a single IP address. Normally, we use proxy servers that supply discrete proxy IP addresses, so that requests are routed through the crawling server under different IPs.
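The rotation idea can be sketched in a few lines of Python. The proxy addresses below are placeholders for whatever your provider hands you, and the commented-out `requests` call is just one of several HTTP clients that accept a per-request proxy dict: a minimal sketch, not a production rotator.

```python
from itertools import cycle

# Hypothetical proxy addresses -- replace with the ones your provider gives you.
PROXIES = [
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
    "http://10.0.0.3:8080",
]

proxy_pool = cycle(PROXIES)  # round-robin: 1, 2, 3, 1, 2, ...

def next_proxy_config():
    """Return a proxies dict in the shape the `requests` library expects."""
    proxy = next(proxy_pool)
    return {"http": proxy, "https": proxy}

# Each request then goes out through a different proxy IP:
# requests.get(url, proxies=next_proxy_config())
```

Real rotators also drop proxies that start failing and add a delay between requests, but the core idea is simply that consecutive requests never share an IP.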
## Proxy Reliability
When it comes to proxy servers, reliability should always be the first thing on our minds. There are roughly a thousand places to buy proxies, and some unreliable ones burn out quickly, which can get you blocked. Other approaches lean toward outsourcing the IP rotation (think proxy-as-a-service), but these services usually cost more. There is the cost of buying a proxy, plus the cost of redeploying it every time you buy a new one. Very often reliability comes at a price, and you will find that "free" tends to be very unreliable, "cheap" somewhat unreliable, and "more expensive" generally carries an extra cost. That is why the concept of cloud-based data extraction has recently been put forward.
## Web Scraping in the Cloud
Cloud-based web scraping is a true cloud service: it can run from any operating system and any browser. We don't have to host anything ourselves; everything is done in the cloud. Moreover, all the page visits, data shaping, and transformation can be handled on someone else's server, while the web proxy requirements can still be managed by ourselves.
On the cloud side, these machines are independent; they can be accessed and run, with nothing to install, from any Internet-connected PC anywhere in the world. The service manages our data on impressive back-end hardware; more specifically, we can use its anonymous proxy feature, which rotates through large numbers of IP addresses to avoid being blocked by the target website.
## Popular Web Scrapers That Avoid IP Blocking
In practice, we can take a more concise and efficient approach by using a data scraping tool with cloud-based services, such as [Octoparse](https://www.octoparse.es/) or [Import.io](https://www.import.io/). These tools can schedule and run your task at any time on the cloud side, with many machines working at the same time. They can also give us a quick way to configure proxy servers manually as needed. Here is a tutorial on how to [set up proxies](http://www.octoparse.es/tutorial-7/set-up-proxies) in Octoparse.
Popular scraping tools on the market include Octoparse, Import.io, Webhose.io, and Screen Scraper.
### 1. [Octoparse](https://www.octoparse.es/)
https://preview.redd.it/oskalw3w1ul71.jpg?width=700&format=pjpg&auto=webp&v=enabled&s=5f51ff5cdaf27c877b43219c64d3065c67ea574e
Octoparse is a powerful and free data crawling tool that can crawl almost any website. Its cloud-based extraction provides a rich pool of rotating proxy IP servers for web scraping, which limits the chances of being blocked and saves a lot of manual configuration time. It provides precise instructions and clear guidelines for every scraping step. Basically, no coding skills are required to use this tool. Still, if you want to go deeper and strengthen your crawling and scraping, it also offers a [public API](http://dataapi.octoparse.com/help) should you need it. In addition, its support team is responsive and available.
### 2. [Import.io](https://www.import.io/)
Import.io is also an easy-to-use desktop data scraper. It has a succinct, effective user interface and simple navigation, and likewise requires little coding skill. Import.io has many powerful features, such as a cloud-based service that helps take care of scheduled tasks and improves mining capacity through its rotating IP addresses. However, Import.io has trouble navigating through JavaScript/POST combinations.
### 3. [Webhose.io](https://webhose.io/)
Webhose.io is a browser-based data crawling tool that applies several crawling techniques to harvest large amounts of data from multiple channels. It may not perform as well as the tools above with respect to its cloud service, which means that scraping involving IP rotation or proxy configuration can be somewhat complex. Free and paid service plans are available as needed.
### 4. [Screen Scraper](https://www.screen-scraper.com/)
Screen Scraper is quite tidy and can handle certain difficult tasks, including precise localization, navigation, and data extraction; however, it requires basic programming/tokenization skills if you want it to perform at its best. That means you have to configure settings and set parameters manually most of the time. The advantage is that you can customize a distinctive mining process; the disadvantages are that it takes some time and is complex. It is also a bit expensive.
***
*Title: "Web Scraping | Using a Proxy Server for Web Scraping" | posted by melisaxinyue, 9/6/2021*
*https://www.reddit.com/r/u_melisaxinyue/comments/piugxt/web_scraping_utilizar_el_servidor_proxy_para_web/*

***
*Posted by melisaxinyue, 11/13/2020:*
In business, large amounts of scraped data can be used for analysis. We can scrape details such as price, stock, and rating, covering various data fields, to monitor changes in goods. This scraped data can further help market analysts and sellers assess potential value or make better-informed decisions.
**However, we cannot scrape all data through website APIs**
Some websites provide APIs for users to access part of their data. But even on sites that provide APIs, there are still data fields we cannot scrape or are not authorized to access.
For example, Amazon provides a [Product Advertising API](https://docs.aws.amazon.com/AWSECommerceService/latest/DG/Welcome.html), but the API itself cannot provide access to all the information shown on its product pages, such as the price. In that case, the only way to scrape more data, say the price field, is to build our own scraper programmatically or to use certain kinds of automated scraping tools.
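A hand-rolled scraper boils down to two steps: fetch the product page's HTML, then pull the target field out of it. The sketch below shows only the second step, on a made-up HTML snippet; the tag, attribute names, and price format are hypothetical, and a real page would need a proper parser with site-specific selectors.

```python
import re

def extract_price(html):
    """Pull the first price-looking string (e.g. $19.99) out of raw HTML.

    A toy regex for illustration only: dollar sign, 1-3 digits,
    optional thousands groups, optional cents.
    """
    match = re.search(r"\$\d{1,3}(?:,\d{3})*(?:\.\d{2})?", html)
    return match.group(0) if match else None

# Stand-in for the HTML that fetching the product page would return:
sample = '<span id="priceblock" class="a-price">$1,299.00</span>'
print(extract_price(sample))  # -> $1,299.00
```

The hard part in practice is not this parsing but fetching reliably, which is where the ban-avoidance issues discussed next come in.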
**Scraping data is hard, even for programmers.**
Sometimes, even if we know how to scrape the data ourselves through programming, say with Ruby or Python, we still might not manage to scrape it successfully, for various reasons. In most cases, we are likely to get banned from certain websites because of suspicious, repetitive scraping actions within a very short time. If so, we may need to use an IP proxy that automatically rotates outgoing IPs so they are not traced by the target sites.
The possible solutions described above may require people to be familiar with coding and more advanced technical knowledge. Otherwise, the task can be difficult or even impossible to complete.
**To make scraping websites accessible to most people, I would like to list several scraping tools that can help you scrape any commercial information, including prices, stock, reviews, etc., in a structured way, with greater efficiency and far faster speed.**
## [Octoparse](https://www.octoparse.es/)
You can use this scraping tool to scrape many websites, such as [Amazon](https://www.octoparse.es/tutorial-7/scrape-product-information-from-amazon), [eBay](https://www.octoparse.es/tutorial-7/scrape-pricing-from-ebay), [AliExpress](https://www.youtube.com/watch?v=WrfHQKKowT0&), Priceline, etc., to obtain data including prices, comments, reviews, and so on. Users don't need to know how to code to scrape data, but they do need to learn how to configure their tasks.
Task configuration is easy to understand, and the user interface is very friendly, as you can see in the image below. There is a Workflow Designer panel where you can point at and drag the visual function blocks. It simulates human browsing behavior and scrapes the structured data users need. With this scraper, you can use proxy IPs just by configuring certain Advanced Options, which is very efficient and fast. Then you can scrape the data you need, including prices, reviews, etc., once the configuration is complete.
https://preview.redd.it/b5c1b1y6czy51.png?width=1920&format=png&auto=webp&v=enabled&s=2a09849db491b66e2cd7e6df8462f01b075f76b5
Extracting hundreds of records or more can be completed in seconds. You can scrape any type of data you want; the data frames are returned as in the figure below, including the price and the scraped customer-review results.
Note: there are two editions of the Octoparse scraping service, a free edition and a paid edition. Both provide the basic scraping capabilities users need, meaning users can scrape data and export it to various formats, such as CSV, Excel, HTML, TXT, and databases (MySQL, SQL Server, and Oracle). If you want data at a much faster speed, you can upgrade your free account to any paid account, where the Cloud Service is available. At least 4 cloud servers running the Octoparse Cloud Service will work on your task simultaneously. Here is a video introducing the Octoparse cloud service.
In addition, Octoparse also offers a data service, meaning you can state your scraping needs and requirements, and the support team will help you scrape the data you need.
## [Import.io](https://import.io/)
Import.io is also known as a web crawler that covers all different levels of crawling needs. It offers a magic tool that can convert a site into a table without any training sessions. It suggests that users download its desktop app if more complicated websites need to be crawled.
Once you have built your API, they offer a number of simple integration options such as Google Sheets, Plot.ly, and Excel, as well as GET and POST requests. It also provides proxy servers to keep users from being detected by target websites, and you can scrape as much data as you need. The tool is not hard to use, and its user interface is quite friendly; you can consult their official tutorials to learn how to configure your own scraping tasks. When you consider that all this comes with a lifetime price tag and an amazing support team, import.io is a clear first port of call for those in search of structured data. They also offer a paid enterprise-level option for companies looking for more complex or larger-scale data extraction.
https://preview.redd.it/khwhdhq7czy51.png?width=1238&format=png&auto=webp&v=enabled&s=48dbaed99fd6a19b0a31377caf7311e4a5f75476
## [**ScrapeBox**](http://www.scrapebox.com/)
SEO experts, online marketers, and even spammers should be very familiar with ScrapeBox. Users can easily harvest data from a website to collect emails, check page rank, verify working proxies, and handle RSS submission. By using thousands of rotating proxies, you will be able to sneak past competitors' site keywords, do research on .gov sites, harvest data, and comment without getting blocked or detected.
https://preview.redd.it/icb92o78czy51.png?width=884&format=png&auto=webp&v=enabled&s=a46845913d04b08a5fc8303370644d88556ac236
***
*Title: "Price Scraping: Downloading Data from E-commerce Websites" | posted by melisaxinyue, 11/13/2020*
*https://www.reddit.com/r/u_melisaxinyue/comments/jtefsr/scraping_precios_descargar_datos_en_comercio/*

***
*Posted by melisaxinyue, 8/16/2021:*
With the growing number of online shoppers, customers are gradually adapting to the e-commerce model and becoming more demanding. The shift in shopping behavior certainly creates more opportunities, but also challenges, for dropshippers. For most dropshipping business owners, the question now is: how do you level up your business to stand out from the competition and keep winning more customers?
You have probably sniffed out the new trend: [D2C](https://www.sana-commerce.com/e-commerce-terms/what-is-d2c-e-commerce/) (Direct-to-Consumer). Although the term has been floating around for a couple of years, few people paid attention until the COVID-19 pandemic hit. With growing e-commerce demand removing the barrier between sellers and consumers, D2C is now the new black.
[Original Image](https://preview.redd.it/m2hjcrowaoh71.png?width=1600&format=png&auto=webp&v=enabled&s=457ae6ab71c22c8b3d5824f03cf6ff589d9ea0bd)
Certainly, there are some things we can borrow from the D2C business model and apply to the dropshipping business.
"According to [eMarketer](https://www.emarketer.com/content/why-more-brands-should-leverage-d2c-model)'s analysis in February 2021, U.S. D2C e-commerce sales grew 45.5% in 2020, generating around $111.54 billion and accounting for 14% of total retail e-commerce sales. D2C is expected to maintain relatively steady growth each year through 2023, by which time D2C e-commerce sales could reach $174.98 billion."
**Table of Contents**
[What is D2C and what makes it so popular?](https://www.octoparse.es/blog/como-pueden-los-dropshippers-aprender-del-negocio-d2c#h1)
[The difference between D2C and dropshipping](https://www.octoparse.es/blog/como-pueden-los-dropshippers-aprender-del-negocio-d2c#h2)
[Drop-ship strategy 1: start with the right suppliers](https://www.octoparse.es/blog/como-pueden-los-dropshippers-aprender-del-negocio-d2c#h3)
[Drop-ship strategy 2: show off your brand with an online storefront](https://www.octoparse.es/blog/como-pueden-los-dropshippers-aprender-del-negocio-d2c#h3)
[Drop-ship strategy 3: level up your services](https://www.octoparse.es/blog/como-pueden-los-dropshippers-aprender-del-negocio-d2c#h3)
## What Is D2C and What Makes It So Popular?
Under a D2C, or Direct-to-Consumer, model, brand owners can sell products directly to their customers on the brands' official websites, instead of relying on brick-and-mortar stores, e-commerce marketplaces, or any other intermediary platform.
This transcends the traditional B2C business model in many ways. Business owners who apply a D2C strategy gain more control over their brands, with no retailer standing in the middle. As a result, they can build closer relationships with end customers and respond more interactively to market demand.
## How Is D2C Different from Dropshipping?
Don't mix up these two concepts. Dropshippers handle customer orders directly and then ship the products through a third-party supplier. As you may have noticed, the cost of physical storage is where the opportunity margin lies. Dropshippers make no up-front investment in holding inventory, BUT that is not sustainable in the long run. D2C, by contrast, does not sound like a lean approach, since you must pay not only for storage but for everything related to operations. Nevertheless, **supporters of the D2C model know that those who get close to their customers can stand firm to the end of the competition.**
If you want to build a dropshipping business as sustainable as the D2C model, it is not hard to see that you only need a few things to get ahead: **reliable suppliers, knowledge of your customers, and a leveled-up order-fulfillment solution. Here is what a drop-ship strategy looks like at a high level:**
## Start with the Right Suppliers
A reliable supplier selling products that fit your marketing niche can win you better odds in the game. When we think of suppliers, we think of AliExpress or other local companies that can offer quality products at a lower cost.
With so many product ideas to look at and so much information available online, finding a good fit through market research can be quite challenging. This is where web scraping comes to your rescue.
[**Web scraping**](https://www.octoparse.es/blog/introduccion-a-las-tecnicas-y-herramientas-de-web-scraping) **is the best practice for retrieving scattered web data in a usable format**. It can collect product information from multiple sources into an organized, structured form. Eventually, the data can be synced to your online store through an [API integration](https://helpcenter.octoparse.es/hc/es/articles/1500000914301-Conectar-la-API-de-Octoparse-paso-a-paso), which connects your online stores and suppliers seamlessly. It can give you basic product information including prices, product names, SKUs, inventory, and image URLs.
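"Structured" here just means each product becomes one typed record. The sketch below, with hypothetical field names and sample rows, shows the kind of normalization step that sits between raw scraped text and a store integration, using CSV as one common hand-off format:

```python
import csv, io

# Raw rows as a scraper might return them -- field names are hypothetical.
scraped = [
    {"product name": " Wireless Mouse ", "price": "$12.99", "sku": "WM-01",
     "inventory": "34", "image": "https://example.com/wm.jpg"},
    {"product name": "USB-C Hub", "price": "$39.50", "sku": "HUB-7",
     "inventory": "0", "image": "https://example.com/hub.jpg"},
]

def normalize(row):
    """Coerce one scraped row into trimmed, typed fields."""
    return {
        "name": row["product name"].strip(),
        "price": float(row["price"].lstrip("$")),
        "sku": row["sku"],
        "in_stock": int(row["inventory"]) > 0,
        "image_url": row["image"],
    }

records = [normalize(r) for r in scraped]

# Serialize to CSV, ready to import into a store or spreadsheet:
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=records[0].keys())
writer.writeheader()
writer.writerows(records)
print(buf.getvalue())
```

An API-based sync would ship the same records as JSON instead of CSV, but the cleaning step is identical.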
**Next, collect data to truly understand your customers.**
Every customer action offers valuable insight into customer behavior. To determine how customers react to your products, or to your competitors', you can [monitor products](http://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios) across marketplaces. This gives you more insight into your position in the market. Tools like [**Octoparse**](https://www.octoparse.es/) can collect information such as reviews, reviewers, ratings, stock levels, and more.
## Show Off Your Brand with an Online Storefront
Instead of relying on a marketplace like Amazon, where a million product varieties compete with one another in every product line, having an independent online store lets you showcase your unique products. The best part is that you have full control over the traffic and the end customers, with whom you can build relationships and share a personalized brand experience.
Razor giant Harry's opted for exactly this marketing strategy, which helped it earn over $1 million in sales in its first month and quickly climb to $100 million within 2 years. Andy Katz-Mayfield, co-founder of Harry's, realized that what consumers need are simple but effective products that feel good to use. That was the breakthrough for starting his own razor business, which offers a simple selling model: a great razor delivered right to your door.
Spot the difference? Harry's tries to offer a simplified yet personalized shopping journey, plus a genuine brand story that resonates with customers. This makes customers feel more connected to the brand and easily eases their concerns about a new business.
## Finally, Level Up Order Fulfillment, and with It the Customer Experience
As a dropshipper, you probably get used to suppliers taking care of the whole order-fulfillment process and don't pay it much attention. But if this "last mile" work turns problematic, all your earlier effort to improve the customer experience will be ruined. Transparent order status, quality products as promised in the listings, packages with ma...
***
*Title: "How Can Dropshippers Learn from the D2C Business?" | posted by melisaxinyue, 8/16/2021*
*https://www.reddit.com/r/u_melisaxinyue/comments/p5bcke/cómo_pueden_los_dropshippers_aprender_del_negocio/*

***
*Posted by melisaxinyue, 11/13/2020:*
**Handling AJAX and JavaScript**
Handling AJAX and JavaScript while scraping the web can sometimes be tricky, especially when you are new to the technology.
Lately I have received many questions about how to scrape AJAX and JavaScript. I have collected some of the questions customers ask most often:
· How do I scrape an infinite-scrolling AJAX website?
· How do I scrape data by clicking the "Load more" or "Next" button?
· How do I scrape websites with AJAX content (like Gumtree)?
· Can Octoparse be used to scrape dynamic content from websites that use AJAX?
· Can I scrape data from a website with pagination?
· Can I scrape websites that load data dynamically (like Facebook)?
· Can I crawl a website that loads content using JavaScript?
......
[*Dealing with infinite scrolling / load more*](https://www.octoparse.es/tutorial-7/infinite-scrolling-and-load-more)
[*Dealing with AJAX*](https://www.octoparse.es/tutorial-7/ajax)
[*Incremental extraction: get updated data easily*](https://www.octoparse.es/tutorial-7/obtenga-datos-actualizados-f%C3%A1cilmente)
[*How to handle pagination with page numbers?*](https://www.octoparse.es/tutorial-7/paginacion-con-numeros-de-pagina)
[*AJAX auto-detection*](https://www.octoparse.es/tutorial-7/autodeteccion-ajax)
**Scraping Web Pages with AJAX Is Not Easy**
Sometimes people look at a web page, see AJAX content being loaded, and conclude that the site cannot be scraped. If you are learning Python and dipping your hand into building a web scraper, it is not going to be very easy. If you are looking for an easy and quick way to do this, especially for heavy workloads, you may want to look into third-party applications for extracting data from AJAX web pages.
**Example: Scraping Websites with Infinite Scrolling**
So, as an example, what I am going to show is how to scrape websites with infinite scrolling. (If you are an experienced programmer who writes your own amazing web scraping tools, just ignore my gibberish.)
[See here how to handle and scrape infinite-scrolling websites.](https://www.octoparse.es/tutorial-7/infinite-scrolling-and-load-more)
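Under the hood, an infinite-scroll page usually calls a paginated AJAX endpoint and appends each batch of results as you scroll; a scraper can skip the browser entirely and drain that endpoint until it returns an empty batch. The sketch below assumes a hypothetical `fetch_page(page)` callable standing in for the real HTTP request:

```python
def scrape_all(fetch_page):
    """Drain a paginated "infinite scroll" endpoint.

    `fetch_page(page)` stands in for the real AJAX request (e.g. an HTTP GET
    with a page or offset parameter); it returns a list of items, empty when
    nothing is left -- which is the loop's stop condition.
    """
    items, page = [], 1
    while True:
        batch = fetch_page(page)
        if not batch:          # empty batch == the scroll has bottomed out
            break
        items.extend(batch)
        page += 1
    return items

# A fake endpoint serving 25 items in batches of 10:
DATA = ["item-%d" % i for i in range(25)]
fake_fetch = lambda page: DATA[(page - 1) * 10 : page * 10]
print(len(scrape_all(fake_fetch)))  # -> 25
```

"Load more" buttons work the same way; the click merely triggers the next `fetch_page` call.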
jtec3g
u_melisaxinyue
melisaxinyue
t3_jtec3g
https://www.reddit.com/r/u_melisaxinyue/comments/jtec3g/web_scraping_scraping_de_ajax_y_javascript/
11/13/2020 9:43:35 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Web Scraping - Scraping de AJAX y JavaScript Websites
False
1
jtec3g
0
7400
5
5
0
0
0
0
0
0
211
59.2696629213483
356
Red
10
Dash Dot Dot
20
No
889
Posted
9/6/2021 7:11:43 AM
**Tabla de Contenidos**
1. [¿Por Qué Utilizar El Servidor Proxy Para El Web Scraping?](https://www.octoparse.es/blog/utilizar-el-servidor-proxy-para-web-scraping#div1)
2. [La Fiabilidad Del Proxy](https://www.octoparse.es/blog/utilizar-el-servidor-proxy-para-web-scraping#div2)
3. [Web Scraping En La Nube](https://www.octoparse.es/blog/utilizar-el-servidor-proxy-para-web-scraping#div3)
4. [Web Scrapers Populares Para Evitar El Bloqueo De IP](https://www.octoparse.es/blog/utilizar-el-servidor-proxy-para-web-scraping#div4)
* Octoparse
* Import.io
* Webhose.io
* Screen Scraper
https://preview.redd.it/v57kcpwr1ul71.png?width=1600&format=png&auto=webp&v=enabled&s=f5e00de70e67d15e856daf21c7960f00a1f5b959
## ¿Por Qué Utilizar El Servidor Proxy Para El Web Scraping?
Web Scraper o spider se vuelve cada vez más popular en la ciencia de datos. Esta técnica automática puede ayudarnos a recuperar una gran cantidad de datos personalizados de la Web o de la base de datos. Sin embargo, el problema principal es que el sitio web puede rastrear fácilmente la solicitud de demasiadas páginas en un período de tiempo demasiado corto mediante una única dirección IP, por lo que el sitio web de destino puede bloquearlo. Para limitar las posibilidades de ser bloqueado, debemos intentar evitar raspar un sitio web con una única dirección IP. Y normalmente, utilizamos servidores proxy que incluyen direcciones IP de proxy discretas siempre que las solicitudes se enrutan a través del servidor de rastreo.
## La Fiabilidad Del Proxy
Preocupados por el servidor proxy, la fiabilidad del proxy siempre debe ser lo primero en nuestra mente. En realidad, hay alrededor de 1000 lugares para comprar proxies y algunos proxies poco confiables irían demasiado rápido, lo que podría causar que se bloqueen. También hay otros enfoques que pueden estar más relacionados con la subcontratación de la rotación de IP (piensa en el proxy como un servicio), pero estos servicios generalmente tienen un costo más alto. Dado que existe un costo de comprar el proxy y el costo de volver a implementar el proxy cada vez que compra uno nuevo. Con mucha frecuencia, la confiabilidad tiene un costo y, a menudo, encontrará que "gratis" será muy poco confiable, "barato" será algo poco confiable y "más costoso" generalmente tendrá un costo adicional. Por lo tanto, recientemente se ha propuesto el concepto de extracción de datos basada en la nube.
## Web Scraping En La Nube
Web Scraping basado en la nube es un verdadero servicio basado en la nube, puede ejecutarse desde cualquier sistema operativo y cualquier navegador. No tenemos que alojar nada nosotros mismos y todo se hace en la nube. Además, todas las visitas a la página del sitio web, la formación de datos y la transformación se pueden manejar en el servidor de otra persona. Los requisitos de proxy web pueden ser gestionados por nosotros mismos.
En el lado de la nube, estas máquinas son independientes, se puede acceder a ellas y ejecutarlas sin necesidad de instalarlas desde cualquier PC con acceso a Internet en todo el mundo. Este servicio administrará nuestros datos con un increíble hardware de back-end, más específicamente, podemos utilizar su función de proxy anónimo que podría rotar toneladas de direcciones IP para evitar ser bloqueadas por el sitio web de destino.
## Web Scrapers Populares Para Evitar El Bloqueo De IP
En realidad, podemos adoptar un enfoque más conciso y eficiente mediante el uso de cierta herramienta Data Scraper con servicios basados en la nube, como [Octoparse](https://www.octoparse.es/), [Import.io](https://www.import.io/). Estas herramientas pueden programar y ejecutar tu tarea en cualquier momento en el lado de la nube con toneladas de PC ejecutándose en el Mismo tiempo. Además, estas herramientas de raspador también pueden proporcionarnos una forma rápida de configurar manualmente estos servidores proxy según lo necesites. Aquí hay un tutorial que presenta cómo [configurar proxies](http://www.octoparse.es/tutorial-7/set-up-proxies) en Octoparse.
Algunas herramientas de raspador populares en el mercado incluyen Octoparse, Import.io, Webhose.io, Screen Scraper.
### 1. [Octoparse](https://www.octoparse.es/)
https://preview.redd.it/oskalw3w1ul71.jpg?width=700&format=pjpg&auto=webp&v=enabled&s=5f51ff5cdaf27c877b43219c64d3065c67ea574e
Octoparse es una herramienta de rastreo de datos poderosa y gratuita que puede rastrear casi todos los sitios web. Su extracción de datos basada en la nube puede proporcionar servidores proxy de dirección IP rotativos ricos para web scraping, lo que ha limitado las posibilidades de ser bloqueado y ahorrado mucho tiempo para la configuración manual. Han proporcionado instrucciones precisas y pautas claras para seguir los pasos de raspado. Básicamente, para esta herramienta, no es necesario tener habilidades de codificación. De todos modos, si deseas profundizar y fortalecer tu rastreo y raspado, ha ofrecido una [API pública](http://dataapi.octoparse.com/help) si lo necesitas. Además, su soporte de respaldo es eficiente y está disponible.
### 2. [Import.io](https://www.import.io/)
Import.io también es un raspador de datos de escritorio fácil de usar. Tiene una interfaz de usuario sucinta y eficaz y una navegación sencilla. Para esta herramienta, también requiere menos habilidades de codificación. Import.io también posee muchas características poderosas, como el servicio basado en la nube que puede ayudarnos a cuidar mejor de nuestra tarea programada y mejorar nuestra capacidad de minería para su dirección IP rotativa. Sin embargo, Improt.io tiene dificultades para navegar a través de combinaciones de javascript / POST.
### 3. [Webhose.io](https://webhose.io/)
Webhose.io es una herramienta de rastreo de datos basada en navegador que utiliza varias técnicas de rastreo de datos para rastrear cantidades de datos de múltiples canales. Si bien puede que no se comporte tan bien como las herramientas introducidas anteriormente sobre su servicio en la nube, lo que significa que el proceso de raspado relacionado con la rotación de IP o la configuración del proxy puede ser algo complejo. Han proporcionado un plan de servicio gratuito y de pago según lo necesites.
### 4. [Screen Scraper](https://www.screen-scraper.com/)
Screen Scraper is quite neat and can handle certain difficult tasks, including precise localization, navigation, and data extraction; however, it requires basic programming/tokenization skills if you want it to work at full capacity. That means you must configure settings and set parameters manually most of the time. The upside is that you can customize a distinctive mining process; the downsides are that it takes time, it is complex, and it is somewhat expensive.
*Posted by u/melisaxinyue in r/u_melisaxinyue on 9/6/2021: "Web Scraping | Utilizar el servidor proxy para Web Scraping" (https://www.reddit.com/r/u_melisaxinyue/comments/piugxt/web_scraping_utilizar_el_servidor_proxy_para_web/)*
**Web scraping tools** (also known as web data extraction or web crawling tools) are now widely applied in many fields. Before scraping tools reached the public, scraping was magic to ordinary people without programming skills: its high threshold kept them locked out of Big Data. **A web scraping tool is automated capture technology that bridges the gap between Big Data and everyone else.**
I have listed the **20 BEST web scrapers**, including their features and target audiences, for your reference. Welcome to take full advantage of them!
**Table of Contents**
**What are the benefits of using web scraping techniques?**
**The 20 BEST web scrapers**
* [**Octoparse**](http://octoparse.es/)
* [**Cyotek WebCopy**](https://www.cyotek.com/cyotek-webcopy)
* [**HTTrack**](https://www.httrack.com/)
* [**Getleft**](https://sourceforge.net/projects/getleftdown/)
* [**Scraper**](https://chrome.google.com/webstore/detail/scraper/mbigbapnjcgaffohmbkdlecaccepngjd)
* [**OutWit Hub**](https://addons.mozilla.org/en-US/firefox/addon/outwit-hub/)
* [**ParseHub**](https://www.parsehub.com/)
* [**Visual Scraper**](http://visualscraper.blogspot.hk/)
* [**Scrapinghub**](https://scrapinghub.com/)
* [**Dexi.io**](https://dexi.io/)
* [**Webhose.io**](https://webhose.io/)
* [**Import.io**](https://www.import.io/)
* [**80legs**](http://80legs.com/)
* [**Spinn3r**](https://www.spinn3r.com/)
* [**Content Grabber**](https://contentgrabber.com/)
* [**Helium Scraper**](http://www.heliumscraper.com/en/index.php?p=home)
* [**UiPath**](http://www.uipath.com/)
* [**Scrape.it**](https://www.npmjs.com/package/scrape-it)
* [**WebHarvy**](https://www.webharvy.com/)
* [**ProWebScraper**](https://prowebscraper.com/)
**Conclusion**
**What are the benefits of using web scraping techniques?**
* Free your hands from repetitive copy-and-paste work.
* Put the extracted data into a well-structured format, including but not limited to Excel, HTML, and CSV.
* Save the time and money of hiring a professional data analyst.
* It is the cure for marketers, sellers, journalists, YouTubers, researchers, and many others who lack technical skills.
**1.** **Octoparse**
**Octoparse** is a web scraper for extracting almost every kind of data you need from websites. You can use Octoparse to rip data from the web with its extensive functionalities and capabilities. It has two operation modes, [**Task Template Mode**](https://helpcenter.octoparse.es/hc/es/articles/360039675314-Empieza-a-usar-Easy-Template-una-soluci%C3%B3n-de-web-scraping-para-principiantes) and **Advanced Mode**, so non-programmers can pick it up quickly. The friendly point-and-click interface guides you through the entire extraction process. As a result, you can easily pull website content and save it in structured formats such as EXCEL, TXT, HTML, or your own databases in a short time.
In addition, it provides **Scheduled Cloud Extraction**, which lets you extract dynamic data in real time and keep a tracking record of the website's updates.
You can also scrape complex websites with difficult structures by using its built-in Regex and XPath configuration to locate elements precisely. And you no longer have to worry about IP blocking: Octoparse offers IP proxy servers that rotate IPs automatically and go undetected by aggressive websites.
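Octoparse's XPath setting does the same job an XPath library does in code: an expression pinpoints exactly the elements to extract. As a rough illustration (the markup and class names below are hypothetical, not a real site), Python's standard library can locate elements the same way:

```python
import xml.etree.ElementTree as ET

# A small, well-formed product listing standing in for a real page.
html = (
    '<div>'
    '<div class="item"><span class="name">Widget A</span><span class="price">9.99</span></div>'
    '<div class="item"><span class="name">Widget B</span><span class="price">14.50</span></div>'
    '</div>'
)

root = ET.fromstring(html)
# The XPath-style predicate [@class='price'] selects only the price spans.
prices = [span.text for span in root.findall(".//span[@class='price']")]
print(prices)  # ['9.99', '14.50']
```

ElementTree only supports a limited XPath subset; a point-and-click tool generates equivalent expressions for you without writing any of this.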
Octoparse should be able to satisfy users' crawling needs, both basic and advanced, without any coding skills.
**2.** **Cyotek WebCopy**
WebCopy is a free website crawler that lets you copy partial or full websites locally onto your hard disk for offline reference.
You can change its settings to tell the bot how you want it to crawl. Beyond that, you can also **configure domain aliases, user-agent strings, default documents**, and more.
However, WebCopy does not include a virtual DOM or any form of JavaScript parsing. If a website makes heavy use of JavaScript to operate, WebCopy most likely cannot make a true copy of it and will probably fail to handle dynamic website layouts correctly.
**3.** **HTTrack**
As a free website crawling program, HTTrack **provides functions well suited to downloading an entire website to your PC**. It has versions available for Windows, Linux, Sun Solaris, and other Unix systems, covering most users. Interestingly, HTTrack can mirror one site, or several sites together (with shared links). You can decide how many connections to open concurrently while downloading web pages under "set options". You can grab the photos, files, and HTML code of the mirrored website and resume interrupted downloads.
In addition, proxy support is available within **HTTrack to maximize speed.**
HTTrack works as a command-line program, for either private (capture) or professional (online web mirror) use. That said, HTTrack is best suited to people with **advanced programming skills**.
**4**. **Getleft**
Getleft is a free, easy-to-use website grabber. It lets you **download an entire website** or any single web page. After launching Getleft, you can enter a URL and choose the files you want to download before it starts. As it goes, it rewrites all the links for local browsing. It also offers multilingual support: Getleft now supports 14 languages! However, it only provides limited FTP support; it will download files, but not recursively.
Overall, Getleft should satisfy users' **basic scraping needs without requiring more sophisticated skills**.
**5**. **Scraper**
Scraper is a Chrome extension with limited data-extraction features, but it is helpful for online research. It also allows **exporting the data to Google Spreadsheets**. You can easily copy data to the clipboard or store it in spreadsheets using OAuth. Scraper can generate XPaths automatically to define the URLs to scrape.
It does not offer an all-inclusive scraping service, but it can meet most people's data-extraction needs.
**6**. **OutWit Hub**
OutWit Hub is a Firefox add-on with dozens of data-extraction features to simplify your web searches. This web scraping tool can browse through pages and store the extracted information in a suitable format.
OutWit Hub offers **a single interface for scraping tiny or huge amounts of data as needed**. OutWit Hub lets you scrape any web page from the browser itself. It can even create automatic agents to extract data.
It is one of the simplest web scraping tools, free to use, and offers the convenience of extracting web data without writing any code.
**7.** **ParseHub**
ParseHub is an excellent web scraper that supports collecting data from websites that use **AJAX, JavaScript, cookies**, and so on. Its machine-learning technology can read, analyze, and then transform web documents into relevant data.
ParseHub's desktop application supports systems such as Windows, Mac OS X, and Linux. You can even use the web app built into the browser.
As a free program, you cannot set up more than five public projects in ParseHub. Paid subscription plans let you create at least 20 private projects to scrape websites.
ParseHub is aimed at practically anyone who wants to play with data. It can be anyone, from analy...
*Posted by u/melisaxinyue in r/u_melisaxinyue on 8/30/2021: "Las 20 Mejores Herramientas de Web Scraping para 2021" (https://www.reddit.com/r/u_melisaxinyue/comments/pedxns/las_20_mejores_herramientas_de_web_scraping_para/)*
**1.** [**Web scraping is illegal**](https://www.octoparse.es/blog/el-web-scraping-es-legal-en-algunos-paises)
Many people hold false impressions about web scraping. That is because there are people who do not respect the hard work published on the Internet and who use scraping tools to steal content. Web scraping is not illegal in itself; the problem arises when people use it without the site owner's permission and in disregard of the Terms of Service (ToS). According to one report, 2% of online revenue can be lost due to content misuse through web scraping. Although web scraping has no clear law and terms addressing its application, it is covered by legal regulations, for example:
* [Violation of the Computer Fraud and Abuse Act (CFAA)](https://www.nacdl.org/cfaa/?source=post_page---------------------------)
* [Violation of the Digital Millennium Copyright Act (DMCA)](https://en.wikipedia.org/wiki/Digital_Millennium_Copyright_Act?source=post_page---------------------------)
* [Trespass to Chattels](http://cyberlaw.stanford.edu/blog/2016/02/digital-trespass-what-it-and-why-you-should-care?source=post_page---------------------------)
* [Misappropriation](https://definitions.uslegal.com/m/misappropriation-law/?source=post_page---------------------------)
* [Copyright infringement](https://www.lawfirms.com/resources/technology-law/technology-and-intellectual-property/copyright-internet.htm)
* [Breach of contract](https://smallbusiness.findlaw.com/business-contracts-forms/breach-of-contract-and-lawsuits.html)
**2. Web scraping and web crawling are the same**
Web scraping involves extracting specific data from a targeted web page, for example, data about sales leads, real-estate listings, and product prices. In contrast, web crawling is what search engines do: they scan and index an entire website along with its internal links. A crawler can navigate the web without a specific goal.
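The difference can be sketched in a few lines of Python: a crawler collects every link it finds so it can keep navigating, while a scraper targets one specific field. The page fragment and class name below are made up for illustration:

```python
from html.parser import HTMLParser

# A hypothetical page fragment with two links and one price field.
PAGE = '<a href="/p/1">Widget</a><span class="price">9.99</span><a href="/p/2">Gadget</a>'

class PageParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []        # what a crawler wants: every link to follow
        self.prices = []       # what a scraper wants: one specific field
        self._in_price = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "a" and "href" in a:
            self.links.append(a["href"])
        if tag == "span" and a.get("class") == "price":
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price:
            self.prices.append(data)

p = PageParser()
p.feed(PAGE)
print(p.links)   # ['/p/1', '/p/2']  -> the crawl frontier
print(p.prices)  # ['9.99']          -> the scraped datum
```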
**3. You can scrape any website**
People often ask to scrape things such as email addresses, Facebook posts, or LinkedIn information. According to an article titled "Is web scraping legal?", it is important to note the following rules before scraping:
* Private data that requires a username and passcode cannot be scraped.
* Comply with the ToS (Terms of Service) when they explicitly prohibit web scraping.
* Do not copy copyrighted data.
A person can be prosecuted under several laws. For example, someone who scraped certain confidential information and sold it to a third party, ignoring a cease-and-desist letter from the site owner, could be prosecuted under Trespass to Chattels, the Digital Millennium Copyright Act (DMCA), the Computer Fraud and Abuse Act (CFAA), and Misappropriation.
That does not mean you cannot scrape social media channels such as Twitter, Facebook, Instagram, and YouTube. They are friendly to scraping services that follow the provisions of the robots.txt file. For Facebook, you must [obtain its written permission](https://www.facebook.com/apps/site_scraping_tos_terms.php) before engaging in automated data collection.
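Checking a site's robots.txt before scraping can itself be automated: Python's standard library ships a parser for exactly this. A minimal sketch with hypothetical rules (`parse()` accepts the file's lines directly, so no network request is needed here):

```python
from urllib import robotparser

# Hypothetical robots.txt rules for an example site.
rules = [
    "User-agent: *",
    "Disallow: /private/",
    "Allow: /",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# can_fetch() tells a polite scraper whether a URL is permitted.
print(rp.can_fetch("*", "https://example.com/public/page"))   # True
print(rp.can_fetch("*", "https://example.com/private/data"))  # False
```

In a real scraper you would call `rp.set_url("https://example.com/robots.txt")` and `rp.read()` to fetch the live rules before crawling.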
**4. You need to know how to code**
[A web scraping tool (data extraction tool)](https://www.octoparse.es/) is very useful for non-technical professionals such as marketers, statisticians, financial consultants, bitcoin investors, researchers, journalists, etc. Octoparse launched a unique feature: [web scraping templates](https://www.octoparse.com/blog/big-announcement-web-scraping-template-take-away), preformatted scrapers that cover more than 14 categories across over 30 websites, including Facebook, Twitter, Amazon, eBay, Instagram, and more. All you have to do is enter the keywords/URLs into the parameters, with no complex task configuration. Web scraping with Python is time-consuming; a web scraping template, by contrast, is an efficient and convenient way to capture the data you need.
**5. You can use scraped data for anything**
It is perfectly legal to extract data from websites for public consumption and use it for analysis. However, it is not legal to scrape confidential information for profit. For example, scraping private contact information without permission and selling it to a third party for profit is illegal. In addition, repackaging scraped content as your own without citing the source is unethical. You must follow the rules: spamming and any fraudulent use of data are prohibited by law.
**6. A web scraper is versatile**
Perhaps you have come across particular websites that change their layout or structure from time to time. Do not get frustrated when you run into a website your scraper cannot read a second time. There are many possible reasons. It is not necessarily because you were identified as a suspicious bot; it can also be caused by different geographic locations or by machine access. In these cases, it is normal for a web scraper to fail to parse the website until the settings are adjusted.
**7. You can scrape the web at high speed**
You may have seen scraper ads boasting about how fast their scrapers are. It sounds good, since they tell you they can collect data in seconds. However, if you cause damage to the company, you become an offender and can be prosecuted. That is because scalable data requests at high speed will overload a web server, which could lead to a server crash; in that case, the person is liable for the damage under the law of "trespass to chattels" (Dryer and Stockton 2013). If you are not sure whether a website is scrapable, ask the web scraping service provider. [Octoparse](https://www.octoparse.es/) is a responsible web scraping service provider that puts customer satisfaction first. For Octoparse, it is crucial to help our clients solve their problems and succeed.
**8.** [**APIs and web scraping are the same**](https://www.octoparse.es/blog/web-scraping-api-para-extraccion-de-datos)
An API is like a channel for sending your data request to a web server and getting back the data you want. An API returns the data in JSON format over the HTTP protocol, for example, the Facebook API, Twitter API, and Instagram API. However, that does not mean you can get any data you ask for. Web scraping, by contrast, makes the process visual, since it lets you interact with the websites themselves. Octoparse has web scraping templates, which are even more convenient for non-technical professionals: you extract data simply by filling the parameters with keywords/URLs.
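The JSON-over-HTTP shape of an API response can be shown in a couple of lines. The payload below is invented for illustration, not a real Facebook/Twitter/Instagram response:

```python
import json

# A hypothetical JSON body as an API would return it over HTTP.
payload = '{"data": [{"user": "melisaxinyue", "followers": 120}]}'

# The client parses the text into structured data and reads fields directly,
# with no HTML layout to navigate; this is the key contrast with scraping.
record = json.loads(payload)
print(record["data"][0]["followers"])  # 120
```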
**9. Scraped data only works for our business after being cleaned and analyzed**
Many data integration platforms can help [visualize and analyze the data](http://www.dataextraction.io/?p=327). In comparison, it looks like data scraping doesn’t have a direct impact on business decision making. Web scraping indeed extracts raw data of the webpage that needs to be processed to gain insights like sentiment analysis. However, some raw data can be extremely valuable in the hands of gold miners.
With Octoparse's Google Search web scraping template for organic search results, you can ext...
*Posted by u/melisaxinyue in r/webscraping on 6/9/2020: "10 Malentendidos sobre El Web Scraping" (https://www.reddit.com/r/webscraping/comments/gzjbkl/10_malentendidos_sobre_el_web_scraping/)*
Travel rules are currently shifting with the curve of Covid cases. With the Delta variant, cases are rising; as I compile this article, the EU is considering reimposing travel restrictions on US visitors.
In any case, I have built my Tripadvisor scraper with Octoparse and analyzed information about the destinations that are open to US citizens. Always be ready for a refreshing trip.
Note: if you are heading to [these countries](https://edition.cnn.com/travel/article/us-international-travel-covid-19/index.html), you may want to check whether vaccination or quarantine is required.
By the way, web scraping is definitely the best way to help us extract web data so we can examine it and get the most out of it. I will show how it helps me gather travel data.
https://preview.redd.it/u7o5uvo6gfn71.jpg?width=698&format=pjpg&auto=webp&v=enabled&s=3cd02312f1643c5178399c4906241094ef8c2b27
Geographic map generated with [mapchart.net](https://mapchart.net/)
## Table of Contents
* Web scraping travel data
* Where can an American go?
* Building a Tripadvisor scraper
## Web Scraping Travel Data
Do you have any idea about [big data in tourism?](https://www.octoparse.es/blog/big-data-en-turismo)
Entrepreneurs in the travel industry are tracking all kinds of data, for example, travel agents' business data and visitor-behavior data across every travel-related platform. They may know your travel habits better than you do. The whole industry is leveraging big data to launch the right product and find the right people to pay for their services.
Web scraping is the technology that makes this possible.
Well, as a traveler, I want to collect travel data from the web to meet my own needs: finding the most attractive destinations and getting the Tripadvisor guides for reference.
**What I am going to do**
* First, I need a list of countries to research.
* Second, I will use a web scraping tool, Octoparse, to build a Tripadvisor scraper and crawl the travel data of these countries.
* Finally, I will pack my luggage and head to the destination that best fits my travel taste!
## Where Can an American Go?
So, where can an American travel right now?
[This CNN article](https://edition.cnn.com/travel/article/us-international-travel-covid-19/index.html) lists the destinations that are open to the US (the list may be updated from time to time).
What I wanted to do was extract all the country names from this web page into a spreadsheet, so I could paste them into Octoparse and fetch more specific data from Tripadvisor.
https://preview.redd.it/ebjsm0zbgfn71.jpg?width=699&format=pjpg&auto=webp&v=enabled&s=2a80334cda9a587d961724848ca811e42c3bd3ec
Octoparse: how to get list information from a web page into Excel
Octoparse can easily pull list information from a web page into Excel or CSV.
This is extremely useful when you want a list of URLs or a list of data that you intend to paste and search on another platform, or import into data-analysis software for your own analysis.
Now that I have the list of destinations as text, I am going to build a Tripadvisor scraper to get specific data on these places.
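Outside of a point-and-click tool, the same list-to-spreadsheet step can be sketched with Python's standard library. The page fragment below is a hypothetical stand-in for the CNN article's destination list:

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical fragment standing in for the article's list of destinations.
PAGE = "<ul><li>Ireland</li><li>Israel</li><li>Italy</li><li>Kenya</li></ul>"

class ListExtractor(HTMLParser):
    """Collects the text of every <li> element on the page."""
    def __init__(self):
        super().__init__()
        self.in_li = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.in_li = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_li = False

    def handle_data(self, data):
        if self.in_li and data.strip():
            self.items.append(data.strip())

parser = ListExtractor()
parser.feed(PAGE)

# Write the extracted names as one CSV column, ready to paste elsewhere.
buf = io.StringIO()
csv.writer(buf).writerows([[name] for name in parser.items])
print(parser.items)  # ['Ireland', 'Israel', 'Italy', 'Kenya']
```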
## Building a Tripadvisor Scraper
The data I am going to crawl from Tripadvisor:
* I want to check the travel popularity of these countries. I will gauge it by the number of reviews about each country on Tripadvisor. (My hypothesis: more visits, more reviews.)
* I have my travel theme. I am a nature lover interested in outdoor events and nature tourism. I will get the tag information of these destinations so I can filter and locate the perfect place to chase the wind, play on the beach, or admire the grandeur of a peak.
* I will save the URLs of the travel guides on Tripadvisor for further trip planning. (Thank you, contributors!)
### Batch-Generating URLs from Country Names
Where do I get this data? Here is a sample page: [Tripadvisor Nepal](https://www.tripadvisor.com/Search?q=Nepal&searchSessionId=628D87C594BA0F3C2D5F64F9187E6C0E1630569008168ssid&sid=CE17A104D3744921A306A608605241AB1630574430004&blockRedirect=true&ssrc=a&geo=1&rf=2).
With the list of country names extracted in the previous step, I can batch-generate all the Tripadvisor country pages with Octoparse.
https://preview.redd.it/pjoqa3mfgfn71.jpg?width=696&format=pjpg&auto=webp&v=enabled&s=b17463060051833c91d96fce7479253b70223499
Octoparse: batch-generating URLs with a parameter
**Examples of generated pages:**
[Tripadvisor Ireland](https://www.tripadvisor.com/Search?q=Ireland&searchSessionId=628D87C594BA0F3C2D5F64F9187E6C0E1630569008168ssid&sid=CE17A104D3744921A306A608605241AB1630574430004&blockRedirect=true&ssrc=a&geo=1)
[Tripadvisor Israel](https://www.tripadvisor.com/Search?q=Israel&searchSessionId=628D87C594BA0F3C2D5F64F9187E6C0E1630569008168ssid&sid=CE17A104D3744921A306A608605241AB1630574430004&blockRedirect=true&ssrc=a&geo=1)
[Tripadvisor Italy](https://www.tripadvisor.com/Search?q=Italy&searchSessionId=628D87C594BA0F3C2D5F64F9187E6C0E1630569008168ssid&sid=CE17A104D3744921A306A608605241AB1630574430004&blockRedirect=true&ssrc=a&geo=1)
[Tripadvisor Kenya](https://www.tripadvisor.com/Search?q=Kenya&searchSessionId=628D87C594BA0F3C2D5F64F9187E6C0E1630569008168ssid&sid=CE17A104D3744921A306A608605241AB1630574430004&blockRedirect=true&ssrc=a&geo=1)
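The batch-generation step amounts to substituting each country name into the search URL's `q` parameter. A minimal sketch using only `q` (the real Tripadvisor links above carry extra session parameters, omitted here):

```python
from urllib.parse import urlencode

# Country names from the scraped list; each becomes one search-page URL.
countries = ["Ireland", "Israel", "Italy", "Kenya"]
base = "https://www.tripadvisor.com/Search"

# urlencode handles any characters that need escaping in the query string.
urls = [f"{base}?{urlencode({'q': country})}" for country in countries]
print(urls[0])  # https://www.tripadvisor.com/Search?q=Ireland
```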
Now that I have a list of target web pages to extract data from, I am going to build a scraper that understands what data I am requesting and will fetch it for me.
### Building a Scraper: Tell It What You Want
Building a scraper is like composing a letter to converse with the computer: you tell it where and how to get the data you want. Only you do not write in a human language, but in programming languages.
And a web scraping tool is like a translator. It lets you compose that letter in human language, thanks to a comprehensible workflow and an intuitive user interface.
If this still sounds abstract, never mind. Let us dive into some questions.
**What can a scraper do?**
* Visit: open a web page.
* Click: click a link on the web page.
* Extract: crawl data such as text, URLs, numbers, etc.
**What data do I need?**
* The country name and the number of reviews.
* The travel guide's link, the guide's title, and its tags.
**How will the scraper act to get the data I need?**
* It will visit the web page
* It will extract the country name and the number of reviews on the page
* It will find the travel guide's link and click it
* It will extract the page URL, the guide's title, and the guide's tags
* It will go back and visit the next web page
* It will repeat the steps above (in Octoparse, this can be done with a [loop](https://helpcenter.octoparse.es/hc/es/articles/360055946274-Elemento-de-bucle))
Bingo. That is the workflow I built here.
https://preview.redd.it/yfrwj07igfn71.jpg?width=640&format=pjpg&auto=webp&v=enabled&s=8a9b71806668e75c0c8cb11fe7266263dc45e24d
Octoparse: how a web scraper's workflow operates
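The visit/extract/click/repeat loop above can also be sketched in code. A real scraper would fetch each page over HTTP; here `PAGES` and `GUIDES` are hypothetical in-memory stand-ins so the workflow structure stays visible:

```python
# Hypothetical stand-in data for the destination pages and their guides.
PAGES = {
    "https://example.com/Nepal": {
        "country": "Nepal",
        "reviews": 10423,
        "guide_link": "https://example.com/Nepal/guide",
    },
}
GUIDES = {
    "https://example.com/Nepal/guide": {
        "title": "Trekking in Nepal",
        "tags": ["outdoors", "nature"],
    },
}

def scrape(urls):
    rows = []
    for url in urls:                          # visit each destination page
        page = PAGES[url]
        row = {                               # extract name and review count
            "country": page["country"],
            "reviews": page["reviews"],
            "guide_url": page["guide_link"],
        }
        guide = GUIDES[page["guide_link"]]    # "click" through to the guide
        row["guide_title"] = guide["title"]   # extract the title and tags
        row["tags"] = guide["tags"]
        rows.append(row)                      # go back, repeat with next URL
    return rows

print(scrape(["https://example.com/Nepal"])[0]["guide_title"])  # Trekking in Nepal
```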
**How to build the workflow?**
Piece of cake.
* Enter the URLs in the search bar and start building a task. (Tell the scraper which web pages to visit)
* Click the data you want in the built-in browser. (Help the scraper locate the data)
* Select the actions you want the scraper to perform in the Tips panel. (Tell the scraper to visit, ...
*Posted by u/melisaxinyue in r/webscraping on 9/14/2021: "Tripadvisor Scraper: los principales destinos abiertos a los ciudadanos bajo Covid" (https://www.reddit.com/r/webscraping/comments/pnym31/tripadvisor_scraper_los_principales_destinos/)*
Since the outbreak of the new contagious, airborne coronavirus, millions of people's lives have been affected, and relevant news has exploded across every platform.
In this situation, we thought it necessary to [collect real-time data](https://helpcenter.octoparse.com/hc/en-us/articles/360024336872-Incremental-Extraction-Get-updated-data-easily) from official and unofficial sources so that the public can gain an unbiased understanding of this epidemic outbreak from transparent data sources.
[https://youtu.be/L2AxYDuMbk4](https://youtu.be/L2AxYDuMbk4)
To get data from these sources, you can take advantage of web scraping tools such as Octoparse, since we have created [web scraping templates](https://www.octoparse.com/blog/big-announcement-web-scraping-template-take-away) to extract data from the Chinese government's report. This can keep you up to date with the latest information. Now let us look at how to use a template to extract live data.
**Step 1: Launch Octoparse on your computer and create a scraping task by clicking "[Task Template](http://www.octoparse.es/tutorial-7/empieze-usar-easy-template)".**
https://preview.redd.it/uocttfbt43051.png?width=497&format=png&auto=webp&v=enabled&s=a6e064109d25d4ae7b8da1e5afdc31d54dcc243f
Notice: there are a number of scraping "recipes" ranging from e-commerce websites to social media channels. These are preformatted crawlers that can be used to extract data from target websites directly. You can check this article to get a better idea of [what a web scraping template is](http://www.octoparse.es/tutorial-7/wizard-mode).
**Step 2: In the "Real-time" category, choose "National Health Commission".**
https://preview.redd.it/2pcsnojv43051.png?width=473&format=png&auto=webp&v=enabled&s=408c172a4479649e93af5e1e6a66f81011887a5c
You will see two templates. One extracts [government news and announcements](http://www.nhc.gov.cn/). The other covers the [Tencent news website](https://news.qq.com/zt2020/page/feiyan.htm), which is directly connected with China's central and local Health Commissions. So far, this is the fastest way to get live data, including confirmed cases, recoveries, death counts, and the mortality rate in every city in China.
https://preview.redd.it/25ots7gw43051.png?width=520&format=png&auto=webp&v=enabled&s=7dc131324355db99f559d9d0b042b297e8d5c7e6
**Step 3: Click "2019-nCov real-time data", since we want to collect live data.**
No configuration is needed. Simply start the extraction and Octoparse will scrape the data automatically. You can export the data to many formats, such as Excel, JSON, and CSV, or to your own database via API. This is what the data output looks like in Excel.
https://preview.redd.it/q4np4e3x43051.png?width=505&format=png&auto=webp&v=enabled&s=57dff485f290659786493069fe4a0cc3d486be11
**You can also extract real-time information from social media channels. There are templates covering popular platforms such as Facebook, Twitter, Instagram, and YouTube.**
For example, if you want to extract the latest tweets about the virus and see how people are reacting, you can take advantage of the "latest tweets" template. It is designed to collect the most recent tweets containing the search keyword you entered, and it lets you extract the web page URL, tweet URL, handles, posts, etc.
https://preview.redd.it/oleu5aox43051.png?width=512&format=png&auto=webp&v=enabled&s=400983daf980c99d1e2c4e282f4c1f09b6fc5a3f
Now let us run this template.
**Step 1: Open Twitter, type "coronavirus", and click the "Latest" tab.** [Copy the URL and paste it into the first parameter](https://twitter.com/search?q=coronavirus&src=recent_search_click&f=live).
https://preview.redd.it/yim7xhqy43051.png?width=487&format=png&auto=webp&v=enabled&s=d88a5a5ef86ebf65b6de31d327351583b13a85df
**Step 2: Enter a number in the second parameter.**
Twitter uses an [infinite scrolling](https://www.octoparse.es/tutorial-7/infinite-scrolling-and-load-more) technique, which means we have to set a "scrolling number" until we get the desired number of posts. You can set any number you want, from 1 to 10,000. The idea is to load the page fully. For example, if you enter the number 10, the bot will scroll 10 times.
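Why the scrolling number matters can be shown with a toy model: each scroll reveals one more batch of posts, so the count bounds how much of the feed the scraper ever sees. The batches below are invented stand-in data, not real Twitter output:

```python
# Each inner list is one batch of posts that a single scroll would load.
feed_batches = [["tweet1", "tweet2"], ["tweet3", "tweet4"], ["tweet5"]]

def load_posts(scrolling_number):
    """Return all posts visible after the given number of scrolls."""
    posts = []
    for batch in feed_batches[:scrolling_number]:
        posts.extend(batch)  # one extra batch becomes visible per scroll
    return posts

print(len(load_posts(2)))  # 4 posts visible after two scrolls
```

Setting the number higher than the feed's depth is harmless; it simply stops loading once there is nothing left.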
**Step 3: Run the scraper by clicking "Save and Run", and you will get the results instantly.**
https://preview.redd.it/s86or0iz43051.png?width=492&format=png&auto=webp&v=enabled&s=fee806d570851d3b94cb4a4e583657a4e1d41a99
In this video we covered how to use web scraping templates to collect real-time data on the coronavirus. If you also want to build your own scraper to extract articles from news portals such as the Wall Street Journal, the New York Times, and Reuters, you can watch this video.
*Posted by u/melisaxinyue in r/CoronavirusRecession on 5/21/2020: "Web Scraping: Cómo obtener Coronavirus (COVID-19) Datos" (https://www.reddit.com/r/CoronavirusRecession/comments/gntkbt/web_scraping_cómo_obtener_coronavirus_covid19/)*
*A comment by u/melisaxinyue on the same thread (5/21/2020):*
Check out the new coronavirus article series: [Extraction and Visualization of Coronavirus Data](http://www.octoparse.es/blog/extraccion-y-visualizacion-de-datos-de-coronavirus)
---

*Posted 11/5/2020:*
Movie data records audience preferences and attitudes toward particular things. Collecting movie information from related websites, such as IMDb and Rotten Tomatoes, supports data analysis and data mining in the film industry. Broadly speaking, the extracted data can be used to:

* Analyze the characteristics of the target audience
* Gather public opinion to predict upcoming trends
* Help drive advertising

There is still more we can do with movie data, depending on our needs. To help you complete [the data collection](http://www.octoparse.es/), this article shows how to extract the horror movie list from IMDb, including the director, the cast, and other key information.

In this case, I will show you how to [**extract the information**](http://www.octoparse.es/blog/top-visualizacion-herramienta) **of 134,555 horror movies from IMDb**, using this link:

[https://www.imdb.com/search/title/?genres=horror&start=51&explore=title\_type,genres&ref\_=adv\_nxt](https://www.imdb.com/search/title/?genres=horror&start=51&explore=title_type,genres&ref_=adv_nxt)

The goal of this web scraper is to find the movies on the horror list and collect the director, the cast, and other key information.

Before starting, [download](https://www.octoparse.es/download) Octoparse V7 to your computer so you can follow along. It is also highly recommended to learn the [basic logic of using Octoparse](https://helpcenter.octoparse.es/hc/es/articles/360052001473).

Let's get started!

**Step 1: Open the target website in the built-in browser of** [Octoparse](http://www.octoparse.es/)**.**

Simply click "+ Task" under the Advanced Mode.

https://preview.redd.it/92rgxhnhhcx51.png?width=991&format=png&auto=webp&v=enabled&s=a4ee2673028d0ad0162122e3094a154fd70bc4ee

Then paste the URL into the box and click the "Save URL" button.

https://preview.redd.it/ptwm3wfihcx51.png?width=957&format=png&auto=webp&v=enabled&s=7f3342450ea255669360bc5308cba0934960ef4d

**Step 2: Click to create a task to scrape the movie information.**

After opening the URL in Octoparse's built-in browser, we can go on to create the pagination and a loop item to get the data.

Simply click the "Next >>" element in the built-in browser, then click "Loop click the selected element" in the Action Tips panel.

https://preview.redd.it/b5qiw9jjhcx51.png?width=1266&format=png&auto=webp&v=enabled&s=e06d520691c127906b51bdcc43767fb5cf44e208

We can see that the pagination has been created in the workflow.

https://preview.redd.it/76jd0k7khcx51.png?width=265&format=png&auto=webp&v=enabled&s=0cc3ddf175b50e54e5365bcc45b9b81322265886

If you want Octoparse to recognize the element you selected more precisely, you can [revise the XPath](https://helpcenter.octoparse.com/hc/en-us/articles/360041118892-What-is-XPath-and-how-to-use-it-in-Octoparse). As shown in the image below, the XPath Octoparse generated is //DIV\[@class='nav'\]/DIV\[2\]/A\[2\]. It is better to change it to //a\[contains(text(), "Next »")\], which matches the link by its text rather than by its position in the page layout.
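The difference between the two XPaths is position-based versus text-based matching. As a rough illustration of the text-based idea outside Octoparse, here is a standard-library-only sketch (the sample HTML is made up) that finds a link by its "Next »" text instead of its position:

```python
from html.parser import HTMLParser

class NextLinkFinder(HTMLParser):
    """Find the href of the <a> whose text contains 'Next »' —
    the same idea as the XPath //a[contains(text(), "Next »")]."""
    def __init__(self):
        super().__init__()
        self.in_a = False
        self.href = None
        self.found = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_a = True
            self.href = dict(attrs).get("href")

    def handle_data(self, data):
        if self.in_a and "Next »" in data:
            self.found = self.href   # matched by text, not by position

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_a = False

html = '<div class="nav"><div><a href="/p1">Prev</a> <a href="/p3">Next »</a></div></div>'
finder = NextLinkFinder()
finder.feed(html)
print(finder.found)  # /p3
```

Matching by text survives layout changes (extra divs, reordered links) that would break the positional //DIV/DIV[2]/A[2] expression.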
https://preview.redd.it/fvzeua3lhcx51.png?width=1691&format=png&auto=webp&v=enabled&s=b0c45eb9e2c9cd89fa7b255dcc67213afef61732

In this case we need to [extract the data](http://www.octoparse.es/) from the movie list, which means we can directly create a loop item to extract it.

Select one of the "blocks" in the browser, and Octoparse will detect all the data fields in the block you selected.

https://preview.redd.it/93aq02tlhcx51.png?width=1713&format=png&auto=webp&v=enabled&s=15118eca835b493fd5ceee5ec13a6ef51964b263

Then choose "Select all sub-elements".

Octoparse then selects all the required data and highlights it in red. Choose "Select All" to continue.

https://preview.redd.it/lwbvinrmhcx51.png?width=1724&format=png&auto=webp&v=enabled&s=9a0b7d49d43494e6a05c8e70e23c252c29198bd2

Finally, select "Extract data in the loop".

https://preview.redd.it/aat4b8mnhcx51.png?width=1724&format=png&auto=webp&v=enabled&s=4fb21ddaff83d3e1addd806f65c344975eb182de

Now we have both the pagination and the loop item set up in Octoparse. We can see the task workflow on the left and the extracted data displayed on the right.

https://preview.redd.it/7y6drwkphcx51.png?width=1725&format=png&auto=webp&v=enabled&s=91662836e0c94c7da1f2f1fbc46ee924a441d7b4

**Step 3: Clean the data in Octoparse.**

Before extracting, it is best to clean the data to improve the final output. Simply click to remove any unwanted field and rename the fields you need.

**Step 4: Extract the data.**

Simply click "Extract data" to get the data locally.

https://preview.redd.it/n6fddjnqhcx51.png?width=1267&format=png&auto=webp&v=enabled&s=7c2f2fbd52eb733fa91651057cd6e17ded3b7d0d

Since local extraction uses your own computing resources, such as the CPU and your Internet connection, it runs slower than Octoparse's cloud extraction.

In any case, once the scraper is built, all you have to do is wait for the data: more than 100,000 rows of movie data in about 2 hours.

https://preview.redd.it/dep8ml8rhcx51.png?width=1728&format=png&auto=webp&v=enabled&s=84cd32f104b8514d3864fa408b9ba904f70b7d4b

With the steps above, anyone, including people without any programming background, can easily build a movie crawler with Octoparse V7 and get more than 100,000 rows of movie information. That is still not the simplest way, however; using Octoparse V8 can be even easier:

https://preview.redd.it/2q9l5oxrhcx51.png?width=1682&format=png&auto=webp&v=enabled&s=836a5df1b93316cfe840ace789badd6d3578977e

In general, with data scraping we can obtain online movie data on any [legal topic](http://www.octoparse.es/tutorial-7/web-scraping-es-legal).

Beyond the data itself, the most valuable thing is the skill you have learned, which is extremely useful for market research, staying up to date, and much more.
*"Movie Crawler: Scraping más de 100,000 información de películas", posted by u/melisaxinyue on 11/5/2020 (https://www.reddit.com/r/u_melisaxinyue/comments/jobwtr/movie_crawler_scraping_más_de_100000_información/)*

---

*Posted 9/1/2021:*
[**Web scraping**](https://octoparse.es/) (also called [**web data extraction**](https://octoparse.es/download), web crawling, web scraping, or web spidering) is a technique for extracting data from web pages. It turns unstructured data into structured data that can be stored on your local computer or in a database.

Building a web scraper can be difficult for people who know nothing about coding. Fortunately, there are tools available both for people with and without programming skills. Here is our list of the 30 most popular web scraping tools, from open-source libraries to browser extensions to desktop software.

**Table of Contents**

* Beautiful Soup
* Octoparse
* Import.io
* Mozenda
* Parsehub
* Crawlmonster
* Connotate
* Common Crawl
* Crawly
* Content Grabber
* Diffbot
* Dexi.io
* DataScraping.co
* Easy Web Extract
* FMiner
* Scrapy
* Helium Scraper
* Scrape.it
* Scrapinghub
* Screen-Scraper
* Salestools.io
* ScrapeHero
* UniPath
* Web Content Extractor
* WebHarvy
* Web Scraper.io
* Web Sundew
* Winautomation
* Web Robots

**1.** [**Beautiful Soup**](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)

**Who is it for**: developers proficient in programming who want to build a web spider/web crawler.

**Why you should use it**: Beautiful Soup is an open-source Python library designed for scraping HTML and XML files. It wraps the main Python parsers, which have been widely used. If you have programming skills, it works best when you combine this library with Python.
This table summarizes the advantages and disadvantages of each parser:

| Parser | Typical usage | Advantages | Disadvantages |
|---|---|---|---|
| Python's html.parser (pure Python) | `BeautifulSoup(markup, "html.parser")` | Batteries included; decent speed; lenient (as of Python 2.7.3 and 3.2) | Not as fast as lxml; less lenient than html5lib |
| lxml's HTML parser | `BeautifulSoup(markup, "lxml")` | Very fast; lenient | External C dependency (lxml) |
| lxml's XML parser | `BeautifulSoup(markup, "lxml-xml")` or `BeautifulSoup(markup, "xml")` | Very fast; the only currently supported XML parser | External C dependency (lxml) |
| html5lib | `BeautifulSoup(markup, "html5lib")` | Extremely lenient; parses pages the same way a browser does; creates valid HTML5 | Very slow; external Python dependency |
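A minimal Beautiful Soup example, assuming the `beautifulsoup4` package is installed, showing the parser being chosen in the second argument:

```python
from bs4 import BeautifulSoup

html = "<html><body><p class='lead'>Hello</p><p>World</p></body></html>"

# The second argument selects the parser; "html.parser" needs no external dependency.
soup = BeautifulSoup(html, "html.parser")

print([p.get_text() for p in soup.find_all("p")])  # ['Hello', 'World']
print(soup.find("p", class_="lead").get_text())    # Hello
```

Swapping `"html.parser"` for `"lxml"` or `"html5lib"` changes speed and leniency per the table above, without changing the rest of your code.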
**2.** [**Octoparse**](https://octoparse.es/)

**Who is it for:** businesses or individuals who need to capture data from websites in e-commerce, investment, cryptocurrency, marketing, real estate, and so on. This software requires no programming or coding skills.

**Why you should use it**: **Octoparse** is a free-for-life SaaS web data platform. You can use it to [capture web data](https://octoparse.es/) and turn unstructured or semi-structured data from websites into a structured data set without coding. It also provides [task templates](https://helpcenter.octoparse.es/hc/es/articles/360039675314-Empieze-usar-Easy-Template-una-soluci%C3%B3n-de-web-scraping-para-principiantes) for the most popular websites in Spanish-speaking countries, such as Amazon.es, Idealista, Indeed.es, Mercadolibre, and many others. Octoparse also offers a web data service, and you can customize your crawler task to fit your scraping needs.

**PROS**

* Clean, easy-to-use interface with a simple workflow panel
* Easy to use, no special knowledge required
* Versatile capabilities for research work
* Abundant task templates
* Cloud extraction
* Auto-detection

**CONS**

* It takes some time to set up the tool and get the first tasks running

**3.** [**Import.io**](https://www.import.io/)

**Who is it for:** enterprises looking for a web data integration solution.

**Why you should use it:** Import.io is a SaaS web data platform. It provides web scraping software that lets you extract data from websites and organize it into data sets. The web data can be integrated into analytics tools for sales and marketing insight.

**PROS**

* Team collaboration
* Very effective and accurate at extracting data from large lists of URLs
* Crawls pages and scrapes according to patterns you specify through examples

**CONS**

* A desktop application needs to be reintroduced, as it recently went cloud-only
* It takes learners some time to understand how to use the tool, and then where to use it

**4.** [**Mozenda**](https://www.mozenda.com/)

**Who is it for:** enterprises and businesses with fluctuating or real-time data needs.

**Why you should use it:** Mozenda provides a data extraction tool that makes it easy to capture content from the web. It also provides data visualization services, removing the need to hire a data analyst.

**PROS**

* Dynamic agent creation
* Clean graphical interface for designing agents
* Excellent customer support when needed

**CONS**

* The agent-management UI could be improved
* When websites change, agents could be better at updating dynamically
* Windows only

**5.** [**Parsehub**](https://www.parsehub.com/)

**Who is it for:** data analysts, marketers, and researchers who lack programming skills.

**Why you should use it:** ParseHub is visual web scraping software you can use to get data from the web. You can extract data by clicking any field on the website. It also offers IP rotation, which helps change your IP address when you run into aggressive websites with anti-scraping techniques.

**PROS**

* Excellent onboarding that helps you understand the workflow and concepts inside the tool
* Cross-platform: Windows, Mac, and Linux
* No basic programming knowledge needed to get started
* Very high-quality user support

**CONS**

* Templates cannot be imported/exported
* Limited integration: JavaScript/regex only

**6.** [**Crawlmonster**](https://www.crawlmonster.com/)

**Who is it for:** SEO and marketing specialists.

**Why you should use it:** CrawlMonster is free web scraping software. It lets you scan websites and analyze your site's content, source code, page status, and much more.

**PROS**

* Easy to use
* Customer support
* Data summary and publishing
* Scans the website for all kinds of data points

**CONS**

* Feature set is not very complete

**7.** [**Connotate**](https://www.connotate.com/)

**Who is it for:** enterprises looking for a web data integration solution.

**Why you should use it:** Connotate has been working together with Import.io, providing a solution for automating web data scraping. It offers a web data service that can help you scrape, collect, and handle data.

**PROS**

* Easy to use, especially for non-programmers
* Data arrives daily and is usually quite clean and easy to process
* Has a job-scheduling concept, which helps obtain data at scheduled times

**CONS**

* A few glitches with each new release cause some frustration
* Identifying faults and resolving them can take longer than we would like

**8.** [**Common Crawl**](https://commoncrawl.org/)

**Who is it for:** researchers, students, and professors.

**Why you should use it:** Common Crawl is built on the idea of open source in the digital age. It provides open data sets of crawled websites, containing raw web page data, extracted metadata, and text extractions.

Common Crawl is a [non-profit organization](https://es.wikipedia.org/wiki/Organizaci%C3%B3n_sin_%C3%A1nimo_de_lucro) ...
*"Los 30 Mejores Software Gratuitos de Web Scraping en 2021", posted by u/melisaxinyue on 9/1/2021 (https://www.reddit.com/r/u_melisaxinyue/comments/pfq78c/los_30_mejores_software_gratuitos_de_web_scraping/)*

---

*Posted 11/13/2020:*
Web scraping (also known as [web harvesting](https://www.octoparse.com/DataCrawler) or [web data extraction](https://www.octoparse.es/)) means extracting data from websites. Generally, users have two options for crawling websites: we can build our own crawlers by coding, or use public APIs.

Alternatively, web scraping can also be done with automated web scraping software, that is, an automated process run by a bot or web crawler. The data extracted from web pages can be exported to various formats or into different types of databases for further analysis.
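That "unstructured markup to structured rows" step can be illustrated with nothing but the Python standard library. This is a generic sketch, not how any particular product works internally, turning a small HTML table into CSV rows:

```python
import csv
import io
from html.parser import HTMLParser

class TableScraper(HTMLParser):
    """Turn an HTML table (unstructured markup) into structured rows."""
    def __init__(self):
        super().__init__()
        self.rows, self.row, self.cell = [], [], None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row = []
        elif tag in ("td", "th"):
            self.cell = ""               # start collecting cell text

    def handle_data(self, data):
        if self.cell is not None:        # ignore text outside cells
            self.cell += data

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self.row.append(self.cell.strip())
            self.cell = None
        elif tag == "tr":
            self.rows.append(self.row)

page = "<table><tr><th>name</th><th>price</th></tr><tr><td>Widget</td><td>9.99</td></tr></table>"
scraper = TableScraper()
scraper.feed(page)

out = io.StringIO()
csv.writer(out).writerows(scraper.rows)  # structured data, ready for a file or database
print(out.getvalue())
```

Real scrapers add fetching, pagination, and error handling on top, but the core transformation is exactly this.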
There are many [web scraping tools](https://www.octoparse.es/blog/las-20-mejores-herramientas-de-web-scraping) on the market. In this post I would like to share some popular, highly rated automatic scrapers and review their respective standout services.

[**1. Visual Web Ripper**](http://visualwebripper.com/)

https://preview.redd.it/mvwgzmqkdzy51.png?width=736&format=png&auto=webp&v=enabled&s=1a93d77a3a40e4e909e2cafad0c1c3b0bc488d27

Visual Web Ripper is an automated web scraping tool with a wide range of features. It works well for certain hard-to-scrape websites, using advanced techniques such as running scripts, which requires users with programming skills.

This scraping tool has an easy-to-use, interactive interface that helps users grasp the basic workflow quickly. Its standout features include:

Multiple data formats

Visual Web Ripper can handle difficult block layouts, especially web elements displayed on the page without a direct HTML association.

AJAX

Visual Web Ripper can extract data delivered via AJAX.

Login required

Users can scrape websites that require logging in first.

Data export formats

CSV, Excel, XML, SQL Server, MySQL, SQLite, Oracle, and OleDB, plus custom C# or VB script file output (with additional programming)

IP proxy servers

Proxies to hide your IP address

Despite offering so many features, it still does not provide a cloud-based service. That means users can only install and run the application on a local machine, which can limit scraping scale and efficiency as data demands grow.

Debugger

Visual Web Ripper includes a debugger that helps users build reliable agents and resolve certain problems effectively.

\[Pricing\]

Visual Web Ripper charges users from $349 to $2,090 depending on the number of seats subscribed. Maintenance lasts 6 months. Specifically, users who buy a single seat ($349) can install and use the application on only one computer; otherwise, they must pay double or more to run it on other devices. If this pricing structure works for you, Visual Web Ripper may belong on your shortlist.

https://preview.redd.it/vvwrq6hmdzy51.png?width=533&format=png&auto=webp&v=enabled&s=c0ca72cbaddb081c0265453fef9af0aa92d7acf2

[**2. Octoparse**](https://www.octoparse.es/)

https://preview.redd.it/05qmfrbndzy51.png?width=1920&format=png&auto=webp&v=enabled&s=490654d52251432de52a12e246e188be664ffbc7

Octoparse is a full-featured, no-coding desktop web scraper with many outstanding features.

It gives users useful, easy-to-use built-in tools. Data can be extracted from difficult or aggressive websites that are hard to crawl.

Its user interface is laid out logically, which makes it very easy to use; users will have no trouble locating any function. In addition, Octoparse visualizes the extraction process with a workflow designer that helps users keep track of the scraping process for any task. Octoparse supports:

Ad blocking

Ad blocking streamlines tasks by reducing load time and the number of HTTP requests.

AJAX setting

Octoparse can extract data delivered via AJAX and set a timeout.

XPath tool

Users can modify the XPath to locate web elements more precisely with the XPath tool Octoparse provides.

Regular expression tool

Users can reformat extracted data with Octoparse's built-in Regex tool, which also helps generate a matching regular expression automatically.
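Octoparse's tool aside, the underlying idea is ordinary regular expressions. For instance, cleaning a scraped price string in plain Python (a generic sketch, unrelated to Octoparse's generated patterns):

```python
import re

def clean_price(raw):
    """Pull a numeric price out of scraped text like '$1,299.00 USD'."""
    m = re.search(r"[\d,]+(?:\.\d+)?", raw)
    return float(m.group().replace(",", "")) if m else None

print(clean_price("$1,299.00 USD"))  # 1299.0
print(clean_price("no price listed"))  # None
```

The same pattern-match-then-normalize step is what such built-in regex tools automate for non-programmers.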
Data export formats

CSV, Excel, XML, SQL Server, MySQL, SQLite, Oracle, and OleDB

IP proxy servers

Proxies to hide your IP address

Cloud service

Octoparse offers a cloud-based service. It speeds up data extraction, running 4 to 10 times faster than local extraction. Once users switch to Cloud Extraction, 4 to 10 cloud servers are assigned to their extraction tasks, freeing users from lengthy maintenance and certain hardware requirements.
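The speedup comes from parallelism: several workers each handle a slice of the pages instead of one machine fetching them in sequence. The same principle in miniature with Python threads (the URLs and the `scrape` body are placeholders, not a real fetch):

```python
from concurrent.futures import ThreadPoolExecutor

urls = [f"https://example.com/page/{i}" for i in range(8)]  # hypothetical pages

def scrape(url):
    # Placeholder for a real fetch-and-parse step; just returns the page id here.
    return url.rsplit("/", 1)[-1]

# Four workers share the eight pages; with real network I/O this cuts wall time.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(scrape, urls))

print(results)
```

A cloud service takes this further by spreading the workers across separate servers, so your own machine does none of the work.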
API access

Users can create their own API that returns data formatted as XML strings.

\[Pricing\]

Octoparse is free to use if you do not choose the Cloud Service. Unlimited page scraping is excellent compared with all the other scrapers on the market. However, if you want to use the Cloud Service for more sophisticated scraping, it offers two paid editions: the **Standard Plan** and the **Professional Plan**.

Both editions offer an excellent scraping service.

https://preview.redd.it/ymibthnqdzy51.png?width=447&format=png&auto=webp&v=enabled&s=457138d250f89311b7c08658d5264d654af75d0d

**Standard Edition**: $75 per month billed annually, or $89 per month billed monthly.

The Standard Edition offers all the standout features.

Number of tasks in the Task Group: 100

Cloud servers: 6

**Professional Edition:** $158 per month billed annually, or $189 per month billed monthly.

The Professional Edition offers all the standout features.

Number of tasks in the Task Group: 200

Cloud servers: 14

To conclude, Octoparse is feature-rich scraping software at a reasonable price.

[**3. Mozenda**](http://www.mozenda.com/)

https://preview.redd.it/sctbraardzy51.png?width=754&format=png&auto=webp&v=enabled&s=b923e9e351a4a04dbaaf1162a9c3e91895282774

Mozenda is a cloud-based web scraping service. It provides many useful features for data extraction, and users can upload extracted data to cloud storage.

Multiple data formats

Mozenda can extract many types of data formats, though it is not as easy when the data layout is irregular.

Regex setting

Users can normalize extracted data with the Regex Editor inside Mozenda; you may need to learn to write regular expressions.

Data export formats

It supports several kinds of data transformation for export.

AJAX setting

Mozenda can extract data delivered via AJAX and set a timeout.

\[Pricing\]

Mozenda users pay for **Page Credits**, the number of individual requests to a website to load a web page. Each subscription plan includes a fixed number of pages in the monthly package price, which means web pages beyond that limit are charged extra...
*"Top 5 Herramientas de Web Scraping Comentario", posted by u/melisaxinyue on 11/13/2020 (https://www.reddit.com/r/u_melisaxinyue/comments/jtej4p/top_5_herramientas_de_web_scraping_comentario/)*

---

*Posted 11/13/2020:*
Big data has changed the sports industry: from team composition and game strategy to marketing operations, and from sports team owners to betting agencies. Sports are commercialized, and beyond being a simple social gathering they also promote positive social influence. Forbes estimated that the sports industry would reach a value of $73.5 billion in 2019. If you have ever come across sports betting, you probably know the power of web scraping. When it comes to scraping sports data from websites, many people think of using R, Python, or the sites' APIs. But all of those are hard for people without prior programming experience, like me.

So here I would like to show non-technical professionals how to extract sports data from websites using Octoparse, [a beginner-friendly web scraping tool](https://www.octoparse.es/). The advantages you get are:

Easier: visual point-and-click operations, no programming required.

Faster: no need to study the websites or test your code.

Multiple data formats: Excel, CSV, JSON, HTML, or export to your database, including SQL Server, MySQL, and Oracle.

**Where can you scrape sports data?**

To answer this question, we need to understand what sports statistics are for. The goals of sports statistics can be divided into two parts: performance analysis and market value analysis. To some extent, the latter is influenced by the former.

Sports performance analysis requires information such as tables, results, schedules, and rankings. Mostly, this information can be found on the relevant official sites, such as NBA.com, FIFA.com, and NFL.com, or on third-party sites that offer aggregated information, such as sportstats.com. Market value analysis, in addition to the information above, requires information from social media or portal sites to assess social influence.

**How can you scrape sports data?**

Rather than a step-by-step tutorial for one specific website, I would rather give you a roadmap for scraping sports data from different kinds of platforms, helping you find the right path to scrape sports data.

**Scraping table information**

Most sports data is displayed in tables, so with the same scraping workflow you can extract information from official sports sites or from any third-party website. To create a scraping crawler that retrieves table information, you can follow these two articles:

[3 Steps to Scrape the Men's Ranking from FIFA.com](https://www.octoparse.com/blog/3-steps-to-scrape-men-s-ranking-on-fifacom)

[Scraping Betting Odds for Sports Analytics](https://www.octoparse.es/blog/cuotas-de-apuestas-deportivas)

**Scraping data from social media**

To scrape reviews or tweets from social media for market value analysis, you can open the search results page in Octoparse's built-in browser, or create a scraping task with keyword input. Follow the instructions in these articles:

[YouTube: Scraping Video Information and Reviews of the 2018 World Cup](https://www.octoparse.com/blog/scraping-visualizing-youtube-comments-on-2018-world-cup)

[Twitter: Scraping tweets from Twitter](https://www.octoparse.es/tutorial-7/scrape-tweets-from-twitter)

[Scraping with entered keywords](https://www.octoparse.es/tutorial-7/text-input)

**Build your own up-to-date sports data feed**

If you need to build a sports data feed that keeps the extracted data updated automatically and continuously, you may want to use Octoparse's premium feature: [Cloud Extraction](https://www.octoparse.es/tutorial-7/cloud-extraction). The benefits include:

- Scraping tasks can be scheduled to run in the cloud at any time and frequency
- Extracted data can be fed programmatically into a database
- Data collection speed increases by up to 6-20x
- Connected to the Octoparse API, you can use the API to feed data into your own system

**Conclusion**

You do not actually need to work through all the scraping tutorials above; just one of them can help you understand the working logic of scraping tasks, which you can then apply to other, similar websites.
*"Web Spider para Estadísticas Deportivas Datos", posted by u/melisaxinyue on 11/13/2020 (https://www.reddit.com/r/u_melisaxinyue/comments/jtedeo/web_spider_para_estadísticas_deportivas_datos/)*

---

*Posted 10/30/2020:*
El mercado financiero es un lugar de riesgos e inestabilidad. Es difícil predecir cómo se desarrollará la curva y, a veces, para los inversores, una decisión podría ser un movimiento decisivo. Esto es el porqué de que los profesionales experimentados nunca dejan de prestar atención a los datos financieros.
Los seres humanos, si no tenemos una base de datos con datos bien estructurados, no podremos llegar a manejar información voluminosa. El raspado de datos es la solución que pone los datos completos al alcance de su mano.
### Tabla de contenidos
[¿Qué Estamos Extrayendo Cuando Scrapeamos Datos Financieros?](http://www.octoparse.es/blog/extraer-datos-financieros-sin-python#h1)
[¿Por Qué Extraer Datos Financieros?](http://www.octoparse.es/blog/extraer-datos-financieros-sin-python#h2)
[¿Cómo Scrapear Datos Financieros sin Python?](http://www.octoparse.es/blog/extraer-datos-financieros-sin-python#h3)
[¡Empecemos!](http://www.octoparse.es/blog/extraer-datos-financieros-sin-python#h4)
## ¿Qué Estamos Extrayendo Cuando Scrapeamos Datos Financieros?
Cuando se trata de extraer datos financieros, los datos del mercado de valores son el centro de atención. Pero hay más, precios de negociación y cambios de valores, fondos mutuos, contrato de futuros, criptomonedas, etc. Los estados financieros, los comunicados de prensa y otras noticias relacionadas con el negocio también son fuentes de datos financieros que la gente va a scrapear.
## ¿Por Qué Extraer Datos Financieros?
Los datos financieros, cuando se extraen y analizan en tiempo real, pueden proporcionar información valiosa para inversiones y comercio. Y las personas en diferentes puestos recopilan datos financieros para diversos fines.
### Predicción del mercado de valores
Las organizaciones de comercio de acciones aprovechan los datos de los portales comerciales en línea como [Yahoo Finance](https://in.finance.yahoo.com/) para mantener registros de los precios de las acciones. Estos datos financieros ayudan a las empresas a predecir las tendencias del mercado y a comprar / vender acciones para obtener las mayores ganancias. Lo mismo ocurre con las operaciones de futuros, monedas y otros productos financieros. Con datos completos a mano, la comparación cruzada se vuelve más fácil y se manifiesta una imagen más amplia.
### Equity research
"Don't put all your eggs in one basket." Portfolio managers perform equity research to predict the performance of multiple stocks. The data is used to identify patterns in their movements and, further, to develop an algorithmic trading model. Before reaching that end, a large amount of financial data will go into the quantitative analysis.
### Financial market sentiment analysis
Collecting financial data is not just about numbers. Things can go qualitative. We may find that the presupposition put forward by Adam Smith is untenable: people are not always economical, or shall we say, rational. Behavioral economics reveals that our decisions are susceptible to all kinds of cognitive biases, that is, plain emotions.
With data from financial news, blogs, and relevant social media posts and reviews, financial organizations can perform sentiment analysis to grasp people's attitude toward the market, which can serve as an indicator of the market trend.
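A minimal sketch of lexicon-based sentiment scoring over scraped headlines (the word lists and headlines below are illustrative stand-ins, not a production lexicon):

```python
# Tiny illustrative sentiment lexicon; a real system would use a curated one
POSITIVE = {"gain", "rally", "growth", "beat", "surge"}
NEGATIVE = {"loss", "drop", "miss", "fear", "crash"}

def sentiment_score(text):
    """Count positive minus negative words; > 0 reads bullish, < 0 bearish."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

headlines = [
    "Tech stocks rally on earnings beat",
    "Markets drop as recession fear spreads",
]
print([sentiment_score(h) for h in headlines])  # prints [2, -2]
```

Averaging such scores over thousands of scraped headlines per day is one simple way to turn qualitative text into the quantitative indicator the paragraph describes.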
## How to Scrape Financial Data without Python?
If you don't know how to code, pay attention: let me explain how you can extract financial data [with the support of Octoparse](http://www.octoparse.es/). Yahoo Finance is a good source of comprehensive, real-time financial data. Below, I will show you how to extract data from the site.
Besides, there are many other financial data sources with up-to-date and valuable information you can extract from, such as [Google Finance](https://www.google.com/finance), [Bloomberg](https://www.bloomberg.com/company/), [CNNMoney](https://edition.cnn.com/business), [Morningstar](https://www.morningstar.com/), [TMXMoney](https://money.tmx.com/en), and so on. All of these sites are HTML pages, which means that all the tables, news articles, and other text/URLs can be extracted in bulk with a web scraping tool.
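Because these pages are plain HTML, their tables can also be parsed programmatically. Here is a sketch using only the Python standard library; the HTML snippet is a stand-in for a real quotes page:

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect the cell texts of every <tr> into one list per row."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], None, False
    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True
    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False
    def handle_data(self, data):
        if self._in_cell and self._row is not None:
            self._row.append(data.strip())

# Stand-in for the HTML of a real quotes table
html = ("<table><tr><th>Symbol</th><th>Price</th></tr>"
        "<tr><td>ACME</td><td>12.34</td></tr></table>")
p = TableParser()
p.feed(html)
print(p.rows)  # prints [['Symbol', 'Price'], ['ACME', '12.34']]
```

This is the same idea a no-code tool applies under the hood: locate repeated row elements and pull out their cell values in bulk.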
To learn more about what web scraping is and what it is used for, you can check out [this article](https://www.octoparse.com/blog/big-data-what-is-web-scraping-and-why-does-it-matter).
## Let's Get Started!
There are 3 ways to get the data:
* Use a web scraping template
* Build your own web crawlers
* Turn to data extraction services
### 1. [Use a template](http://www.octoparse.es/) for Yahoo Finance web scraping
To help newcomers get started easily with web scraping, Octoparse offers a variety of [web scraping templates](https://www.octoparse.com/blog/big-announcement-web-scraping-template-take-away). These templates are preformatted, ready-to-use crawlers. Users can pick one of them to extract data from the respective pages instantly.
The Yahoo Finance template offered by Octoparse is designed to scrape cryptocurrency data. No further configuration is required. Simply click "try it" and you will get the data in a few minutes.
### 2. Build a crawler from scratch in 2 steps
Besides cryptocurrency data, you can also build a crawler from scratch in 2 steps to extract [world indices from Yahoo Finance](https://finance.yahoo.com/world-indices). A custom crawler is highly flexible in terms of data extraction, and this method can also be used to scrape other Yahoo Finance pages.
Step 1: Enter the web address to create a crawler
The bot will load the website in the built-in browser, and one click on the Tips Panel can trigger the auto-detection process and fill in the table's data fields.
Step 2: Run the crawler to get the data
When all the desired data is highlighted in red, save the configuration and run the crawler. As you can see in the pop-up window, all the data has been scraped successfully. Now you can export the data to Excel, JSON, or CSV, or to your database via the API.
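The export step itself is straightforward once the records exist. For instance, writing scraped records to CSV and JSON with the Python standard library (the field names and index values below are illustrative):

```python
import csv
import json

# Illustrative scraped records; in a real run these would come from the crawler
records = [
    {"symbol": "^GSPC", "name": "S&P 500", "price": "4,158.24"},
    {"symbol": "^DJI", "name": "Dow 30", "price": "33,093.34"},
]

# CSV: one header row, then one line per record
with open("indices.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["symbol", "name", "price"])
    writer.writeheader()
    writer.writerows(records)

# JSON: the same records as a pretty-printed array
with open("indices.json", "w", encoding="utf-8") as f:
    json.dump(records, f, indent=2, ensure_ascii=False)
```

Excel and database/API exports follow the same pattern with the appropriate client library in place of the file writers.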
### 3. [Financial data extraction services](http://www.octoparse.es/)
If you scrape financial data only occasionally and in fairly small amounts, handy web scraping tools will do, and you may find something interesting along the way while building your own crawlers. However, if you need voluminous data for deep analysis, say, millions of records, and you hold a high standard of accuracy, it is better to hand your scraping needs over [to a group of reliable web scraping professionals](https://service.octoparse.com/data-service).
#### Why are data scraping services worth it?
1. Saving time and energy
The only thing you have to do is clearly convey to the data service provider which data you want. Once that is done, the data service team will handle the rest smoothly. You can dive into your core business and do what you are good at. Let the professionals do the scraping work for you.
2. Zero learning curve and tech headaches
Even the easiest scraping tool takes time to master. The ever-changing environments across websites can be hard to handle. And when you scrape at scale, you may run into issues such as IP bans, low speed, duplicate data, and so on. A data scraping service can free you from these troubles.
3. No legal violations
If you don't pay close attention to the terms of service of the data sources you are scraping, you may get into trouble with web scraping. With the support of an experienced counselor, a professional web scraping service provider...
---
*Source: "3 Ways to Extract Financial Data WITHOUT Python", posted by u/melisaxinyue to r/u_melisaxinyue on 10/30/2020: https://www.reddit.com/r/u_melisaxinyue/comments/jkrd86/3_formas_de_extraer_datos_financieros_sin_python/*

---

# 10 Best Online Big Data Analytics Courses in 2021
*Posted by u/melisaxinyue to r/bigdata on 9/17/2021*
Today, data science is no longer just a buzzword, given the boom of the data-driven market. The [IBM report](https://www.ibm.com/downloads/cas/3RL3VXGA) estimates that data-related job openings will rise to 2.7 million by 2021. That said, demand for **professional data skills** such as machine learning and AI makes them must-haves for analytics talent.
[Image source: https://www.octoparse.es/blog/10-big-data-analytics-cursos-en-linea](https://preview.redd.it/wejhz8jpn0o71.png?width=700&format=png&auto=webp&v=enabled&s=3d603444fbb9ad849e54d99c84f74692b108c5f8)
This article recommends the **10 best online courses** for beginners, especially those planning to take on data analytics jobs.
**Table of Contents**
Data Analysis and Presentation Skills: the PwC Approach Specialization
Data Science Specialization
Big Data Specialization
Statistics with R
Microsoft Professional Program in Data Science
Marketing Analytics
Big Data Fundamentals
Advanced Data Structures
Python
Java Tutorial for Complete Beginners
## Coursera
### 1. Data Analysis and Presentation Skills: the PwC Approach Specialization
**Provider: Price Waterhouse Coopers LLP**
**Commitment**: 21 weeks, 3-4 hours/week
This specialization includes 5 courses, ranging from data-driven decision making and problem solving with basic Excel functions, through data visualization with advanced Excel, to business presentations with PowerPoint and a final project.
* *Course 1: Data-Driven Decision Making*
* *Course 2: Problem Solving with Excel*
* *Course 3: Data Visualization with Advanced Excel*
* *Course 4: Effective Business Presentations with PowerPoint*
* *Course 5: Data Analysis and Presentation Skills: the PwC Approach Final Project*
**Average rating: 4.6**
This data analysis specialization was designed by PwC for its employees. The PwC approach focuses more on business applications than on theory. It is suitable for people with no programming background.
**Price:** 7-day free trial, then you can continue your studies for $49 a month
### 2. Data Science Specialization
**Provider: Johns Hopkins University**
**Commitment**: 43 weeks, 4-9 hours/week
Composed of 10 courses, this specialization covers the concepts and tools you will need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results.
**Average rating:** 4.6
This is one of the longest data science specializations on Coursera. Unlike the PwC one, it focuses more on theory related to statistics, algorithms, and data analysis. Moreover, the courses are based on the R programming language, so basic programming knowledge is recommended before taking them.
**Price:** 7-day free trial, then you can continue your studies for $49 a month
### 3. Big Data Specialization
**Provider: University of California, San Diego**
**Commitment:** 30 weeks, 3-6 hours/week
With a total of 6 courses, it covers the main aspects of big data, from a basic introduction, modeling, management systems, integration, and processing, to machine learning and graph analytics.
* *Course 1: Introduction to Big Data*
* *Course 2: Big Data Modeling and Management Systems*
* *Course 3: Big Data Integration and Processing*
* *Course 4: Machine Learning Overview*
* *Course 5: Graph Analytics for Big Data*
* *Course 6: Big Data - Capstone Project*
**Average rating: 4.3**
This is an excellent introduction to big data for beginners that does not go too deep into programming. No prior programming experience is needed. It involves several open-source software tools, including Apache Hadoop.
**Price:** 7-day free trial, then you can continue your studies for $49 a month
### 4. Statistics with R
**Provider: Duke University**
**Commitment:** 27 weeks, 5-7 hours/week
With the 5 courses in this specialization, you will learn to analyze and visualize data in R. You will be able to create reproducible data analysis reports, demonstrate a conceptual understanding of the unified nature of statistical inference, and perform frequentist and Bayesian statistical inference and modeling.
* *Course 1: Introduction to Probability and Data*
* *Course 2: Numerical and Categorical Data*
* *Course 3: Linear Regression and Modeling*
* *Course 4: Bayesian Statistics*
* *Course 5: Statistics with R Capstone*
**Average rating:** 4.5
The course is all about R programming. Make sure you come fully prepared with programming skills.
**Price:** 7-day free trial, then you can continue your studies for $49 a month
## edX
### 5. Microsoft Professional Program in Data Science
**Provider: Microsoft**
**Commitment:** 56-58 weeks, 2-4 hours/week
Composed of 4 units (10 courses in total) and a capstone project, this program covers a basic introduction to data science, essential programming languages, and advanced programming languages in applied data science.
* *Unit 1 - Fundamentals*
* *Unit 2 - Core Data Science*
* *Unit 3 - Applied Data Science*
* *Unit 4 - Capstone Project*
**Average rating:** N/A
As you would expect, it is closely tied to Microsoft software, including Excel, Power BI, Azure, and R Server. The courses also cover R and Python.
**Price:** free, or pay $99 to get more resources
### 6. Marketing Analytics
**Provider: University of California, Berkeley**
**Commitment:** 16 weeks, 5-7 hours per week
With the 5 courses in this specialization, you can earn a certificate and program credit after graduation. The program is designed and taught by industry expert Stephan Sorger, who held leading marketing and product development roles at organizations such as Oracle, 3Com, and NASA.
* *Course 1: BerkeleyX Marketing Analytics MicroMasters® Program*
* *Course 2: Marketing Analytics: Marketing Measurement Strategy*
* *Course 3: Marketing Analytics: Price and Promotion Analytics*
* *Course 4: Marketing Analytics: Competitive Analysis and Market Segmentation*
* *Course 5: Marketing Analytics: Products, Distribution and Sales*
**Average rating:** N/A
This program focuses on using data for marketing planning and decision making, including marketing measurement strategy, price and promotion analytics, competitive analysis and market segmentation, and product distribution and sales. Personally speaking, it is a good program for a digital marketer who wants to sharpen their quantitative skills.
**Price:** $896.40 for the full program experience
## Cognitive Class
### 7. Big Data Fundamentals
**Provider: IBM**
**Commitment:** 13 hours
It consists of just 3 courses, which offer a brief introduction to big data, Hadoop, and Spark. Cognitive Class was formerly known as Big Data University; it has since been rebranded as an IBM-backed MOOC provider.
* *Course 1: Big Data 101*
* *Course 2: Hadoop 101*
* *Course 3: Spark Fundamentals 1*
**Average rating:** N/A
As a Big Data 101 program, these courses introduce basic concepts about big data and how it fits into our daily life and work. Along the way, many big data tools are presented to show how data is captured, processed, and visualized.
**Price:** free
**Alternative course:** Master of Science in Data Science. Creator: Maryville University. This program is a fully online, 36-credit data science program designed to let you develop skills, ...
---
*Source: "10 Best Online Big Data Analytics Courses in 2021", posted by u/melisaxinyue to r/bigdata on 9/17/2021: https://www.reddit.com/r/bigdata/comments/ppvnbm/10_mejores_cursos_online_de_analítica_de_big_data/*

---

# How Web Scraping Helps Hedge Funds Gain an Edge
*Posted by u/melisaxinyue to r/hedgefund on 7/24/2020*
Please see the original article: [How Web Scraping Helps Hedge Funds Gain an Edge](http://www.octoparse.es/blog/ayuda-a-hedge-funds-a-obtener-ventaja)
It has become impossible to keep once-hidden data hidden. Many advanced tools can now extract fresh data, or even pull it from multiple sources across the Internet. Deeper analysis has allowed hedge funds to exploit an important new and growing source of alpha.
Earlier this year, Greenwich Associates and Thomson Reuters collaborated on a study offering insight into the tremendous changes in the investment research landscape. Titled "**The Future of Investment Research**", it covers many of the contributing factors behind this qualitative shift and makes some particularly informative observations about alternative data.
https://preview.redd.it/45coxf0tqrc51.png?width=620&format=png&auto=webp&v=enabled&s=fe614a0564b6fb015a2f1fb22d2f34473d3850c3
The importance of alternative datasets has been reviewed before; they include [geolocation data](http://www.datadriveninvestor.com/2018/10/17/your-mobile-phone-as-a-gold-mine-for-hedge-funds/) and [satellite imagery](http://www.datadriveninvestor.com/2018/09/18/looking-to-the-skies-for-alpha/), and they are showing hedge funds that there is plenty of untapped alpha in these datasets for institutions willing to put their money into acquiring them, so they can gain a decisive information advantage over the competition.
According to the Greenwich/Thomson Reuters study, [the average investment firm spends around $900,000](https://www.institutionalinvestor.com/article/b19fsq17p6zp5n/Big-Data-Too-Popular-for-its-Own-Good) on alternative data annually, while the industry's annual alternative data budgets are currently estimated at around $300 million. That is almost twice as much as the year before. Based on these figures, web-scraped data has been identified as the most popular kind of data adopted by investment professionals.
https://preview.redd.it/norg15juqrc51.png?width=418&format=png&auto=webp&v=enabled&s=aa11f987fbdd8775505d1176796b18847e5c77ce
In web scraping (also known as "data scraping", "spidering", or "automated data extraction"), software is used to extract potentially valuable data from online sources. For hedge funds, paying companies to obtain this particular data can help them make smarter, better-reasoned investment decisions, even ahead of their competitors.
Quandl is one example of such a company, and it is now the center of attention in the [alternative data revolution](https://www.datadriveninvestor.com/2018/07/30/quandl-and-the-alternative-data-revolution/). What this Canadian company does is scrape the web to compile datasets, or collaborate with domain experts, and then offer the data for sale to hedge funds and other interested clients.
Greenwich reports many kinds of web-scraped data, including information from expert networks, product prices, web traffic data, and search trends.
One example is how Goldman Sachs Asset Management scraped web traffic on Alexa.com and was able to spot a skyrocketing rise in visits to HomeDepot.com. The asset manager was able to acquire the shares before the company raised its outlook and reap the profits when its shares eventually appreciated.
Among its various strategies, an alternative data company, [Eagle Alpha](https://eaglealpha.com/our-story/), scrapes pricing data from large retailers, which has proven valuable as a directional indicator for consumer product sales. For example, with data sourced from electronics websites in the United States, the company could observe that demand for GoPro products was declining, and therefore correctly conclude that the action camera maker would miss its 2015 Q3 targets. [More than 68 percent of recommendations were to buy the stock](https://www.forbes.com/sites/freddiedawson/2016/03/25/twitter-your-hedge-fund-better/#1d0ae142385e) two days before GoPro's underperformance was publicly announced.
The value of social media data cannot be underestimated. It is the largest dataset helping us understand social behavior, and companies are [actively scraping](https://www.octoparse.es/blog/5-mejores-rastreadores-web-de-redes-sociales) this data to uncover its hidden value.
According to a [recent report](https://www.bloomberg.com/professional/blog/trading-twitter-evolving-market/) from Bloomberg, "The Twitter feed provides very large and healthy alternative datasets, particularly for researchers looking for alpha." Bloomberg's newly launched news service takes in finance-related Twitter feeds and scans valuable news tweets for investment insights.
On the value of social media data, it was found that "Dow Jones movements can be predicted from collective mood states derived directly from large-scale Twitter feeds, with an accuracy of around [87.6 percent](https://arxiv.org/pdf/1010.3003v1.pdf)".
[EY ran a survey in November 2017](https://www.ey.com/en_gl/wealth-asset-management/how-will-you-use-innovation-to-illuminate-competitive-advantages) and found that social media data was being used, or would be used within 6-12 months, by more than a quarter of hedge funds in their investment strategies. Providers source the data themselves from platforms such as Facebook, YouTube, and Twitter, or sometimes through web scraping tools like [**Octoparse**](https://www.octoparse.es/).
As popular, easily accessible websites such as Amazon and Twitter are actively scraped, hedge funds will be pushed to keep looking for new and unusual data sources that surface accurate trading signals, in order to stay at the top of their game. For this reason, there is no limit to how deep firms may dig. Even [the dark web](https://www.fnlondon.com/articles/hedge-funds-gain-an-edge-from-the-dark-web-20170802) may come into play.
https://preview.redd.it/61ywx5jxqrc51.png?width=620&format=png&auto=webp&v=enabled&s=578baf5fabb7c69bf47cd9a8d6ebd0238f4bd039
Scraped data can even include data on customers or individuals, especially data that can be pulled from different sources such as criminal records, flight records, phone directories, and electoral rolls. Given the debates around personal data that gained traction this year, particularly with the emergence of Facebook's Cambridge Analytica scandal, scrapers will soon face strong opposition from advocates of data privacy laws.
Tammer Kamel, CEO and founder of Quandl, has [recently stated](https://www.ft.com/content/08a22da8-b587-11e6-ba85-95d1533d9a62) that there is a "healthy paranoia" across organizations about removing personal information before his company's alternative datasets are sold, since skipping that step can carry serious consequences. In any case, proper regulatory protection is paramount at this level. The implication is that too much information about an individual can be collected, as we still lack a governing set of rules.
Last year, the Hedge Fund Law Report [stated](https://www.hflawreport.com/2552996/best-pr...
---
*Source: "How Web Scraping Helps Hedge Funds Gain an Edge", posted by u/melisaxinyue to r/hedgefund on 7/24/2020: https://www.reddit.com/r/hedgefund/comments/hwye2v/cómo_web_scraping_ayuda_a_hedge_funds_obtener/*

---

# Tripadvisor Scraper: Top Destinations Open to Citizens under Covid
*Posted by u/melisaxinyue to r/webscraping on 9/14/2021*
Travel rules are currently changing with the curve of Covid cases. With the Delta variant, case numbers are rising. As I compile this article, the EU is considering reimposing travel restrictions on American visitors.
In any case, I have built my Tripadvisor scraper with Octoparse and analyzed information on the destinations that are open to US citizens. Always be ready for a refreshing trip.
Note: if you are heading to [these countries](https://edition.cnn.com/travel/article/us-international-travel-covid-19/index.html), you may want to check whether vaccination or quarantine is required.
By the way, web scraping is definitely the best way to help us extract web data so we can examine it and make the most of it. I will show how it helps me get the travel data.
https://preview.redd.it/u7o5uvo6gfn71.jpg?width=698&format=pjpg&auto=webp&v=enabled&s=3cd02312f1643c5178399c4906241094ef8c2b27
Geographic map generated by [mapchart.net](https://mapchart.net/)
## Table of Contents
* Web scraping travel data
* Where can an American go?
* Build a Tripadvisor scraper
## Web Scraping Travel Data
Do you have any idea about [big data in tourism?](https://www.octoparse.es/blog/big-data-en-turismo)
Entrepreneurs in the travel industry are tracking all kinds of data, for example, business data from travel agents and visitor behavior data across travel-related platforms. They may know your travel habits better than you do. The whole industry is leveraging big data to launch the right product and find the right people to pay for their services.
Web scraping is the technology that makes this possible.
Well, as a traveler, I want to collect travel data from the web for my own needs: to find destinations among the most attractive ones and to get the Tripadvisor guides for my reference.
**What I am going to do**
* First, I need a list of countries to research.
* Second, I will use a web scraping tool, Octoparse, to build a Tripadvisor scraper and crawl the travel data for these countries.
* Finally, I will pack my luggage and head to the destination that best fits my travel tastes!
## Where Can an American Go?
So, where can an American travel right now?
[This CNN article](https://edition.cnn.com/travel/article/us-international-travel-covid-19/index.html) lists the destinations that are open to the US (the list may be updated from time to time).
What I wanted to do was extract all the country names from this web page into a spreadsheet so I could paste them into Octoparse and get more specific data from Tripadvisor.
https://preview.redd.it/ebjsm0zbgfn71.jpg?width=699&format=pjpg&auto=webp&v=enabled&s=2a80334cda9a587d961724848ca811e42c3bd3ec
Octoparse: how to get list information from a web page into Excel
Octoparse can easily pull list information from a web page into Excel or CSV.
This is extremely useful when you want a list of URLs or a list of data that you intend to paste and search on another platform, or import into data analysis software for your own analysis.
Now that I have the text list of destinations, I am going to build a Tripadvisor scraper to get specific data about these places.
## Build a Tripadvisor Scraper
The data I am going to crawl from Tripadvisor:
* I want to check the travel popularity of these countries. I will use the number of reviews about each country on Tripadvisor as a proxy. (My hypothesis: more visits, more reviews.)
* I have my own travel theme. I am a nature lover interested in outdoor events and nature tourism, so I will grab the tag information for these destinations to filter for the perfect place where I can chase the wind, play on the beach, or admire the grandeur of a peak.
* I will save the URLs of the travel guides on Tripadvisor for further trip planning. (Thank you, contributors!)
### Batch-Generate URLs from Country Names
Where to get this data? Here is a sample page: [Tripadvisor Nepal](https://www.tripadvisor.com/Search?q=Nepal&searchSessionId=628D87C594BA0F3C2D5F64F9187E6C0E1630569008168ssid&sid=CE17A104D3744921A306A608605241AB1630574430004&blockRedirect=true&ssrc=a&geo=1&rf=2).
With the list of country names extracted in the previous step, I can batch-generate all the Tripadvisor country pages with Octoparse.
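The same batch generation can be sketched in a few lines of Python: only the query parameter `q` changes per country (the session-ID parameters from the sample URL are omitted here for clarity, which is an assumption about what the search endpoint minimally needs):

```python
from urllib.parse import urlencode

# Country names scraped from the CNN list in the previous step (a sample)
countries = ["Ireland", "Israel", "Italy", "Kenya"]

BASE = "https://www.tripadvisor.com/Search"

def search_url(country):
    """Build one Tripadvisor search URL for a country name."""
    return f"{BASE}?{urlencode({'q': country, 'ssrc': 'a', 'geo': 1})}"

urls = [search_url(c) for c in countries]
print(urls[0])  # prints https://www.tripadvisor.com/Search?q=Ireland&ssrc=a&geo=1
```

`urlencode` also takes care of escaping names with spaces (e.g. "Costa Rica" becomes `q=Costa+Rica`).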
https://preview.redd.it/pjoqa3mfgfn71.jpg?width=696&format=pjpg&auto=webp&v=enabled&s=b17463060051833c91d96fce7479253b70223499
Octoparse: batch-generate URLs with a parameter
**Examples of generated pages:**
[Tripadvisor Ireland](https://www.tripadvisor.com/Search?q=Ireland&searchSessionId=628D87C594BA0F3C2D5F64F9187E6C0E1630569008168ssid&sid=CE17A104D3744921A306A608605241AB1630574430004&blockRedirect=true&ssrc=a&geo=1)
[Tripadvisor Israel](https://www.tripadvisor.com/Search?q=Israel&searchSessionId=628D87C594BA0F3C2D5F64F9187E6C0E1630569008168ssid&sid=CE17A104D3744921A306A608605241AB1630574430004&blockRedirect=true&ssrc=a&geo=1)
[Tripadvisor Italy](https://www.tripadvisor.com/Search?q=Italy&searchSessionId=628D87C594BA0F3C2D5F64F9187E6C0E1630569008168ssid&sid=CE17A104D3744921A306A608605241AB1630574430004&blockRedirect=true&ssrc=a&geo=1)
[Tripadvisor Kenya](https://www.tripadvisor.com/Search?q=Kenya&searchSessionId=628D87C594BA0F3C2D5F64F9187E6C0E1630569008168ssid&sid=CE17A104D3744921A306A608605241AB1630574430004&blockRedirect=true&ssrc=a&geo=1)
Now that I have a list of target web pages to extract data from, I am going to build a scraper that understands what data I am asking for and will fetch it for me.
### Build a Scraper: Tell It What You Want
Building a scraper is like composing a letter to converse with the computer: tell it where and how to get the data you want. Only you don't speak in a human language, but in programming languages.
A web scraping tool is like a translator. It lets you compose the letter in human language, thanks to an understandable workflow and an intuitive user interface.
If this still sounds abstract, no matter. Let's dive into a few questions.
**What can a scraper do?**
* Visit - open a web page.
* Click - click a link on the web page.
* Extract - crawl data such as text, URLs, numbers, etc.
**What data do I need?**
* The country name and the number of reviews.
* The travel guide link, the guide title, and its tags.
**How will the scraper act to get the data I need?**
* It will visit the web page
* It will extract the country name and the number of reviews on the page
* It will find the travel guide link and click it
* It will extract the page URL, the guide title, and the guide tags
* It will go back and visit the next web page
* It will repeat the steps above (in Octoparse, this can be done with a [loop](https://helpcenter.octoparse.es/hc/es/articles/360055946274-Elemento-de-bucle))
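Those steps map naturally onto a loop. A sketch of the same flow in Python, where the fetch and parse helpers are placeholders (they are not Octoparse APIs, just stand-ins for whatever download and parsing code you would supply):

```python
def fetch(url):
    """Placeholder: download the page HTML (e.g. with urllib or a browser driver)."""
    raise NotImplementedError

def parse_country_page(html):
    """Placeholder: pull (country name, review count, guide link) from a search page."""
    raise NotImplementedError

def parse_guide_page(html):
    """Placeholder: pull (guide title, tags) from a travel guide page."""
    raise NotImplementedError

def scrape(urls):
    """Visit each country page, follow its guide link, collect one record per country."""
    records = []
    for url in urls:  # loop over the batch-generated URLs
        country, reviews, guide_url = parse_country_page(fetch(url))
        title, tags = parse_guide_page(fetch(guide_url))
        records.append({"country": country, "reviews": reviews,
                        "guide_url": guide_url, "title": title, "tags": tags})
    return records
```

The no-code workflow does exactly this: visit, extract, click through, extract, go back, repeat.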
Bingo. That is the workflow I built here.
https://preview.redd.it/yfrwj07igfn71.jpg?width=640&format=pjpg&auto=webp&v=enabled&s=8a9b71806668e75c0c8cb11fe7266263dc45e24d
Octoparse: how a web scraper's workflow works
**How to build the workflow?**
Piece of cake.
* Enter the URLs in the search bar and start a build task. (Tell the scraper which web pages to visit)
* Click the data you want in the built-in browser. (Help the scraper locate the data)
* Select the actions you want the scraper to perform in the Tips Panel. (Tell the scraper to visit, ...
---
*Source: "Tripadvisor Scraper: Top Destinations Open to Citizens under Covid", posted by u/melisaxinyue to r/webscraping on 9/14/2021: https://www.reddit.com/r/webscraping/comments/pnym31/tripadvisor_scraper_los_principales_destinos/*

---

# Octoparse 8.4 | Web Scraping Tool in Spanish
*Posted by u/melisaxinyue to r/u_melisaxinyue on 10/29/2021*
Hello, everyone. The Spanish version of [Octoparse](https://www.octoparse.es/) is now available! Octoparse is a tool that lets you extract web data easily without coding.
https://preview.redd.it/ikxk5q85hcw71.png?width=1278&format=png&auto=webp&v=enabled&s=67b45f40011655e0f28641e1ce4fd5d6c4d6c842
[Click here to watch the video](https://youtu.be/6aXtSo-eiZM)
In version 8.4, Octoparse can automatically export your cloud data via Zapier to Google Drive, Google Sheets, and other software. Zapier is a tool that helps you integrate workflows between different applications without any code.
When an event occurs in one application, Zapier is triggered to tell another application to perform a particular action, according to the Zap you have created. Connecting with Zapier helps you automate your work and frees up time for what matters most, across thousands of popular apps.
¡Y hay más! Puedes personalizar el agente de usuario, hacer una copia de seguridad de los datos locales en la nube y formatear la marca de tiempo. Siempre hay más de los que esperas.
*Posted by melisaxinyue in r/u_melisaxinyue, 10/29/2021: "Octoparse 8.4 | Web Scraping Tool in Spanish" (https://www.reddit.com/r/u_melisaxinyue/comments/qi7f9o/octoparse_84_herramienta_de_web_scraping_en/)*
*Posted by melisaxinyue in r/analyzit, 11/1/2021: "A Framework for Data Analysis Reports" (post removed) (https://www.reddit.com/r/analyzit/comments/qk69xr/un_marco_para_informes_de_análisis_de_datos/)*
Octoparse users, how is your web scraping journey with the software going? This month, version 8.4.2 of the product will be released. Want to know what's new in the upcoming version? Keep reading!
https://preview.redd.it/8roxoyd726t71.png?width=1600&format=png&auto=webp&v=enabled&s=52c07555edec6c76b61d5a678d6d73b15003f873
## 1. Zapier integration
In version 8.4.2, you can automatically export your cloud data through [Zapier](https://zapier.com/) to Google Drive, Google Sheets, and other software.
[Octoparse x Zapier](https://preview.redd.it/2vopj10b26t71.png?width=600&format=png&auto=webp&v=enabled&s=3efbb3bd45cd860fdb4e985718f08e9175aecde0)
[Find more information here and try it out.](https://zapier.com/apps/google-drive/integrations/octoparse)
## 2. Scrape while scrolling within a given section
Take Google Maps as an example. In version 8.4.2 you can enter the page and scrape the search results using just this feature. It is implemented by configuring an [XPath](https://www.octoparse.es/blog/como-encontrar-xpath-para-localizar-datos-en-una-pagina-web).
https://preview.redd.it/898ua79e26t71.png?width=599&format=png&auto=webp&v=enabled&s=244ea336c8a09fc865216af5eed9a40767fb6471
## 3. Customize the user agent
You can change the user-agent string and the user-agent name in the browsers when using version 8.4.2 to extract data.
To understand how user agents work, this article may help: [How to Change User Agents in Chrome, Edge, Safari & Firefox](https://www.searchenginejournal.com/change-user-agent/368448/)
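Outside of a no-code tool, overriding the user agent takes only a few lines. Here is a minimal Python sketch; the UA string is just an example, not anything Octoparse itself uses.

```python
import urllib.request

# Example custom user-agent string (hypothetical). Without it, urllib
# identifies itself as "Python-urllib/3.x", which many sites filter out.
CUSTOM_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) MyScraper/1.0"

def build_request(url: str, user_agent: str = CUSTOM_UA) -> urllib.request.Request:
    """Build a request whose User-Agent header is overridden."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

req = build_request("https://example.com")
# urllib.request.urlopen(req) would now send the custom header
```

The target site sees whatever string you put in the header, which is all "customizing the user agent" means at the HTTP level.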
## 4. Back up local data to the cloud
This feature used to be available only to Enterprise users. In the new 8.4.2 version, it is also open to users on Professional plans.
## 5. Timestamp formatting
This feature is designed mainly for scraping social media platforms. [Converting post timestamps to dates](https://timestamp.online/) is available in version 8.4.2.
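Timestamp conversion is a one-liner in most languages; a small Python sketch of the idea (this is a generic illustration, not Octoparse's internal implementation):

```python
from datetime import datetime, timezone

def format_timestamp(ts: int, fmt: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Turn a raw Unix timestamp (seconds) into a readable UTC date string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime(fmt)

print(format_timestamp(1634108644))  # -> 2021-10-13 07:04:04
```

Social platforms usually ship post dates as these raw second counts, which is why scraped data benefits from a formatting pass.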
## 6. Other updates to existing features and the user interface (UI)
With these updates, version 8.4.2 will be more stable and convenient to use than previous versions.
What's more, the Octoparse 8.4.2 interface is now available in Spanish; you can switch the language as needed.
Feel free to contact us at [support@octoparse.com](mailto:support@octoparse.com) or [submit a ticket](https://helpcenter.octoparse.es/hc/es/requests/new) if you have any questions. The customer support team will be ready to help you, as always. Wishing you even happier scraping!
*Posted by melisaxinyue in r/u_melisaxinyue, 10/13/2021: "What's New in Octoparse 8.4.2?" (https://www.reddit.com/r/u_melisaxinyue/comments/q75oip/qué_novedades_presenta_octoparse_842/)*
Can you believe that 70% of Internet traffic is generated by spiders\*? Surprisingly, it's true! Countless spiders, [web crawlers](https://www.octoparse.es/DataCrawler), or search bots are busy doing their jobs on the Internet. They simulate human behavior: walking through websites, clicking buttons, checking data, and retrieving information.
With so much traffic generated, they must have accomplished something magnificent. Some applications may sound familiar, such as price monitoring in e-commerce, social media monitoring in PR, or gathering research data for academic studies. Here, we'd like to dig into the 3 web scraping applications that are rarely known yet surprisingly *profitable*.
**1. Transportation**
Budget airline platforms are very popular among web scrapers because of their unpredictable cheap-ticket promotions.
The airlines' original intention is to offer cheap tickets at random to attract tourists\*, but scalpers found a way to profit from this. Geek scalpers use web crawlers to continually refresh the airline's website. Once a cheap ticket becomes available, the crawler books it.
AirAsia, for example, only holds a reservation for an hour; if payment isn't made by then, the ticket goes back into the ticket pool, ready for sale. Scalpers buy the ticket back in the millisecond after it returns to the pool, and so on. Only once you order the ticket from a scalper will they use the program to release the ticket in the [AirAsia](https://www.airasia.com/en/gb) system and, 0.00001 seconds later, rebook it under your name.
Doesn't sound like an easy job to me. Maybe the middleman's commission is fair enough.
**2. E-Commerce**
There must be plenty of price-comparison sites and cashback websites in a smart shopper's bookmarks. Whatever the name, "price comparison platform," "e-commerce aggregator," or "coupon website," the idea is the same: make money by saving your money.
They scrape prices and [images](https://www.octoparse.com/blog/free-image-extractors-around-the-web) from [e-commerce websites](https://www.octoparse.com/blog/scrape-javascript-rendered-webpages) and display them on their own sites.
E-commerce giants like [Amazon](https://www.octoparse.es/tutorial-7/scrape-product-information-from-amazon) know it's hard to reverse the open-source trend\*, so they start a business selling [their API](https://aws.amazon.com/api-gateway/pricing/?nc1=h_ls) with a long price list. It sounds as if the price-comparison sites have to do all the work: write code to scrape, provide a sales channel, and pay the bills!
Not so fast. Let me tell you how these e-commerce aggregation platforms earn their profits:
* Suppose there is a platform aggregating several stores that sell Okamoto. When people search for the product, the platform gets to decide which store ranks first and which ranks last. In short, the store that pays the platform more money ranks at the top.
* If bidding for ranking is too painful (@Google AdWords), there is an easier way out: buy ads on the page. Every time a visitor clicks through to an online store, the site earns money by charging for clicks.
* The website can also act as an agent and earn commissions. That's easy to understand, since they help stores sell products. And on the belief that "the more, the better," that's how cashback websites were born.
In short, they are doing pretty well.
**3. Social security**
Even for two companies that agree to share databases through APIs, web scraping may still be needed to turn web-page data into structured data reports.
Take the Big Three as an example: Equifax, Experian, and TransUnion, which hold the credit files of 170 million American adults and sell more than 600 million credit reports every year, generating over $10 billion in revenue\*.
"It's actually a pretty simple business model. They collect as much information about you as they can from lenders, aggregate it, and sell it back to them," said Brett Horn, a [Morningstar industry analyst\*](https://www.usatoday.com/story/money/personalfinance/2017/10/06/equifax-makes-money-knowing-lot-you/738824001/).
They receive your personal behavioral data from your banks, employers, local courts, shopping malls, and even gas stations. With so many reports to [analyze](https://www.octoparse.es/blog/30-mejores-herramientas-de-big-data-para-datos-analisis), web scraping is a great help in organizing the data: it can turn web pages into structured data reports.
There are many ways to scrape websites. If you want to scrape data at scale from many websites, a web scraping tool comes in handy. Here is a list of the [top 10 web scraping tools](https://www.octoparse.es/blog/mejores-datos-scraping-herramientas-2020) for reference.
Web scraping is an incredible way to collect data for your business. If you are looking for a reliable web scraping service to scrape data from the web, you can try getting started with [Octoparse](https://www.octoparse.es/download).
References:
\*[http://dopro.io/web-spider-function.html](http://dopro.io/web-spider-function.html)
\*[https://www.eyefortravel.com/mobile-and-technology/scraping-single-biggest-threat-travel-industry](https://www.eyefortravel.com/mobile-and-technology/scraping-single-biggest-threat-travel-industry)
\*[https://www.scmp.com/lifestyle/travel-leisure/article/2168635/five-tips-using-bookingcom-skyscanner-expedia-and-other](https://www.scmp.com/lifestyle/travel-leisure/article/2168635/five-tips-using-bookingcom-skyscanner-expedia-and-other)
\*[https://www.cnbc.com/2017/10/03/it-costs-consumers-4-point-1-billion-to-freeze-credit-reports.html](https://www.cnbc.com/2017/10/03/it-costs-consumers-4-point-1-billion-to-freeze-credit-reports.html)
\*[https://thenextweb.com/dd/2017/02/15/why-enterprises-should-embrace-open-source/](https://thenextweb.com/dd/2017/02/15/why-enterprises-should-embrace-open-source/)
\*[https://www.usatoday.com/story/money/personalfinance/2017/10/06/equifax-makes-money-knowing-lot-you/738824001/](https://www.usatoday.com/story/money/personalfinance/2017/10/06/equifax-makes-money-knowing-lot-you/738824001/)
*Posted by melisaxinyue in r/u_melisaxinyue, 11/13/2020: "3 Web Scraping Applications to Make Money" (https://www.reddit.com/r/u_melisaxinyue/comments/jte9gy/3_web_scraping_aplicaciones_para_ganar_dinero/)*
Web data extraction is gaining traction as one of the best ways to collect useful data to grow a business profitably. Although web data extraction has been around for a long time, it has never been used as heavily as it is today. This guide aims to give web scraping novices a general idea of web data extraction.
### Table of contents
**What is web data extraction**
**Benefits of web data extraction**
**How web data extraction works**
**Web data extraction for non-programmers**
**Legal aspects of web data extraction**
**Conclusions**
## What is web data extraction
Web data extraction is the practice of bulk-copying data with bots. It goes by many names, depending on what people like to call it: web scraping, data scraping, web crawling, and so on. The data extracted (copied) from the Internet can be saved to a file on your computer or to a database.
## Benefits of web data extraction
Businesses can reap many benefits from web data extraction. It can be used more widely than expected, but it is worth highlighting how it is used in a few areas.
**1 E-commerce price monitoring**
The importance of price monitoring speaks for itself, especially when you sell items on an online marketplace like Amazon, eBay, or Lazada. These platforms are transparent: buyers, and likewise any of your competitors, have easy access to the prices, inventories, reviews, and all kinds of information about every store. That means you cannot focus only on price; you also have to keep an eye on other aspects of your competitors. So, beyond prices, there is more to explore. Price monitoring can be about more than prices.
Most e-commerce retailers and suppliers try to put a lot of product information online. This is useful for buyers to evaluate, but it is also a lot of exposure for store owners, because with such information competitors can see how you run your business. Fortunately, you can use that data to do the same.
You should also collect information about your competitors, such as price, inventory levels, discounts, product turnover, newly added items, newly added locations, product category ASP, and so on. With this data in hand, you can grow your business with the following benefits offered by web data extraction.
1. Increase margins and sales by adjusting prices at the right time in the right channels.
2. Maintain or improve your competitiveness in the market.
3. Improve your cost management by using competitors' prices as a basis for negotiating with suppliers, or by reviewing your own overhead and production costs.
4. Devise effective pricing strategies, especially during promotions such as end-of-season sales or holiday seasons.
**2 Marketing analysis**
Almost anyone can start their own business as long as they are connected to the Internet, thanks to the easy entry the magical Internet offers. With more and more businesses popping up online, competition among retailers only gets fiercer. To make your company stand out and keep growing sustainably, you can do more than simply cut prices or launch advertising campaigns. Those may be productive for an early-stage business, but in the long run you need to keep an eye on what other players are doing and adapt your strategies to an ever-changing environment.
You can study your customers and competitors by scraping product prices, customer behavior, product reviews, events, stock levels, demand, and so on. With this information you will gain insight into how to improve your service and products and how to stay ahead of your competitors. Web data extraction tools can streamline this process, providing you with always up-to-date information for marketing analysis.
You gain a better understanding of your customers' demands and behaviors, and can then identify specific customer needs to make exclusive offers.
1. Analyze customer reviews and feedback about your competitors' products and services to improve your own product.
2. Run predictive analysis to help forecast future trends, plan future strategies, and optimize your priorities in time.
3. Study your competitors' product copy and images to find the best ways to differentiate yourself from them.
**3 Lead generation**
There is no doubt that being able to generate more leads is one of the essential skills for growing your business. How do you generate leads effectively? Many people talk about it, but few know how to do it. Most salespeople, however, still look for leads on the Internet in the manual, traditional way. What a typical example of wasting time on trivia.
Today, smart salespeople look for leads with the help of web scraping tools, across social media, online directories, websites, forums, and so on, to save more time for working on their promising prospects. Just leave that boring, mindless lead-copying work to your crawlers.
When using a web crawler, don't forget to collect the following information for lead analysis. After all, not every lead is worth your time. You should prioritize prospects who are ready or willing to buy from you.
1. Personal information: name, age, education, phone number, job title, email
2. Company information: industry, size, website, location, profitability
As time goes by, you will collect plenty of leads, even enough to build your own CRM. With a database of your target audience's email addresses, you can send information, newsletters, event invitations, or advertising campaigns in bulk. But beware of spam!
## How does web data extraction work?
Now that you know you can benefit from a web data extraction tool, you may want to build one yourself to reap the fruits of this technique. It is important to first understand how a crawler works and how web pages are built before starting your web data extraction journey.
1. Build a crawler with a programming language, then feed it the URL of a website you want to extract from. It sends an HTTP request to the web page's URL. If the site grants you access, it responds by returning the content of the web pages.
2. Parsing the web page is only half of web scraping. The scraper inspects the page and interprets the HTML as a tree structure. The tree structure works like a navigator that helps the crawler follow paths through the site's structure to reach the data.
3. After that, the web data extraction tool extracts the data fields you need and stores them. Finally, when the extraction is finished, choose a format and export the scraped data.
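The parse-extract-export steps can be sketched in a few lines of standard-library Python. The HTTP request of the first step is left as a comment so the sketch stays self-contained, and the "field" extracted here is just the page title; a real crawler would target whatever fields it needs.

```python
import csv
import io
from html.parser import HTMLParser

class TitleParser(HTMLParser):
    """Interpret the HTML tree and pull out the <title> field."""
    def __init__(self):
        super().__init__()
        self.title = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self.title = data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

def scrape_to_csv(pages):
    """Parse each page, extract its title, and export everything as CSV.
    In a real crawler, each entry of `pages` would come from an HTTP
    request, e.g. urllib.request.urlopen(url).read().decode()."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["title"])
    for html in pages:
        parser = TitleParser()
        parser.feed(html)
        writer.writerow([parser.title])
    return out.getvalue()
```

Feeding it one page, `scrape_to_csv(["<html><head><title>Demo</title></head></html>"])`, returns a two-row CSV with the header `title` and the extracted value `Demo`.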
The web scraping process is easy to understand, but building a scraper from scratch is definitely not easy for people without technical knowledge. Fortunately, thanks to the development of big data, there are many free web data extraction tools. Stay tuned, there are some nice free scrapers that I lo...
*Posted by melisaxinyue in r/u_melisaxinyue, 10/30/2020: "Web Data Extraction: The Definitive Guide for 2020" (https://www.reddit.com/r/u_melisaxinyue/comments/jkq9wx/extracción_de_datos_de_web_la_guía_definitiva_de/)*
Without a doubt, web scraping has advantages. It is fast and cost-effective, and it can collect data from websites with better than 90% accuracy. It frees you from copying and pasting out of messy layouts. However, something may have been overlooked: there are limitations, and even risks, lurking behind web scraping.
#### Click to read:
[What is web scraping and what is it used for?](http://www.octoparse.es/blog/limitaciones-de-web-scraping#h1)
[What is the best way to extract web data?](http://www.octoparse.es/blog/limitaciones-de-web-scraping#h2)
[What are the limitations of web scraping tools?](http://www.octoparse.es/blog/limitaciones-de-web-scraping#h3)
[Final words](http://www.octoparse.es/blog/limitaciones-de-web-scraping#h4)
## What is web scraping and what is it used for?
For those unfamiliar with web scraping, let me explain. Web scraping is a technique used to extract information from websites at high speed. You can access the extracted data, saved locally, at any time. Web scraping serves as one of the first steps in data analysis, data visualization, and data mining, since it collects data from many sources. Preparing the data is a prerequisite for later visualization or analysis. That much is obvious. So how do we get started with web scraping?
## What is [the best way to extract web data](http://www.octoparse.es/)?
There are some common techniques for extracting data from web pages, all of which come with limitations. You can build your own crawler using programming languages, outsource your web scraping projects, or use a web scraping tool. Without a specific context, there is no single "best way to do web scraping." Consider your coding knowledge, your available time, and your budget, and you will arrive at your own choice.
> For example, if you are an experienced coder confident in your skills, you can certainly extract the data yourself. But since every website needs its own crawler, you will have to build multiple crawlers for different sites. This can cost you a lot of time, and you must be equipped with enough programming knowledge to maintain the crawlers. Think about it.
> If you own a business with a big budget and want accurate data, the story is different. Forget about programming; just hire a group of engineers or outsource the project to professionals.
> Speaking of outsourcing, you can find freelancers online who offer data collection services. The unit price looks quite affordable. However, if you calculate carefully against the number of sites and the number of items you plan to collect, the total expense can grow exponentially. Statistics show that to extract information on 6,000 Amazon products, quotes from web scraping companies average $250 for the initial setup and $177 for monthly maintenance.
> If you own a small business, or simply need data without coding knowledge, the best option is to choose a suitable scraping tool that fits your needs. For reference, you can check this list of [The 30 Best Free Web Scraping Software](http://www.octoparse.es/blog/30-mejores-software-gratuitos-de-web-scraping).
## What are the limitations of web scraping tools?
#### 1. Learning curve
Even the easiest scraping tool takes time to master. Some tools, like [Apify](https://apify.com/), still require coding knowledge to use. Some tools that are not user-friendly can take weeks to learn. Scraping websites successfully requires knowledge of XPath, HTML, and AJAX. So far, the easiest way to scrape websites is to use [pre-built web scraping templates](https://www.octoparse.com/blog/big-announcement-web-scraping-template-take-away) to extract data in a few clicks.
#### 2. Website structures change frequently
Extracted data is organized according to the website's structure. Sometimes you revisit a site and find the layout has changed. Some designers constantly update their websites to improve the UI; some may do it for anti-scraping purposes. The change can be as small as repositioning a button, or it can be a drastic redesign of the whole page. Even a minor change can mess up your data. Since crawlers are built against the old site, you have to adjust them every few weeks to keep getting the right data.
#### 3. It is not easy to handle complex websites
Here comes another tricky technical challenge. Looking at web scraping in general, 50% of websites are easy to scrape, 30% are moderate, and the last 20% are quite hard. Some scraping tools are designed to extract data from simple sites with numbered pagination. Yet nowadays, more websites include dynamic elements such as AJAX. Big sites like Twitter use [infinite scrolling](https://www.octoparse.com/tutorial-7/infinite-scrolling-and-load-more), and some websites require users to click a "load more" button to keep loading content. In these cases, users need a more capable scraping tool.
#### 4. Extracting data at a large scale is much harder
Some tools cannot extract millions of records, as they only handle small-scale scraping. This gives headaches to e-commerce business owners who need millions of rows of fresh data fed straight into their database. Cloud-based scrapers like [Octoparse](http://www.octoparse.es/) and [Web Scraper](https://webscraper.io/) perform well for large-scale data extraction: tasks run on multiple cloud servers, giving you high speed and gigantic space for data retention.
#### 5. A web scraping tool is not omnipotent
What kinds of data can be extracted? Mainly text and URLs.
Advanced tools can extract text from source code (inner and outer HTML) and use regular expressions to reformat it. For images, only their URLs can be extracted, to be converted into image files later. If you are curious about how to extract image URLs and bulk-download them, take a look at How to Build an Image Crawler Without Coding.
Also, note that most web scrapers cannot crawl PDF files, as they parse HTML elements to extract data. To extract data from PDFs, you need other tools such as [Smallpdf](https://smallpdf.com/pdf-to-excel) and [PDFelement](https://pdf.wondershare.com/pdfelement.html).
#### 6. Your IP may get banned by the target website
Captchas are annoying. Have you ever had to solve a captcha while scraping a website? Be careful: that could be a sign of IP detection. Scraping a website generates heavy traffic, which can overload a web server and cause economic loss to the site owner. There are many [tricks](https://www.octoparse.com/blog/scrape-websites-without-being-blocked) to avoid getting blocked. For example, you can configure your tool to simulate a human's normal browsing behavior.
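Two of the most common politeness tricks, randomized delays between requests and a rotating user-agent pool, fit in a few lines of Python. This is a generic sketch, not Octoparse's mechanism, and the UA strings are abbreviated examples.

```python
import random
import time

# Hypothetical pool of user-agent strings; rotating them makes the
# traffic look less uniform than repeating one string on every request.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Mozilla/5.0 (X11; Linux x86_64)",
]

def polite_headers():
    """Pick a random user agent for the next request's headers."""
    return {"User-Agent": random.choice(USER_AGENTS)}

def polite_delay(base=2.0, jitter=3.0):
    """Sleep a human-like, randomized interval between requests.
    Returns the delay actually used (base + up to `jitter` seconds)."""
    delay = base + random.random() * jitter
    time.sleep(delay)
    return delay
```

A scraping loop would call `polite_delay()` before each fetch and pass `polite_headers()` into the request; neither trick guarantees anything, but both reduce the telltale regularity that gets IPs flagged.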
#### 7. There are even some legal issues involved
Is web scraping legal? A simple "yes" or "no" may not cover the whole...
*Posted by melisaxinyue in r/u_melisaxinyue, 10/30/2020: "The 7 Limitations of Web Scraping You Should Know" (https://www.reddit.com/r/u_melisaxinyue/comments/jkqahz/las_7_limitaciones_del_web_scraping_que_debe/)*

*Posted by melisaxinyue, 11/13/2020:*
Saving an image from a web page is simple: just right-click and select "save image as". But what if you have hundreds, or even thousands, of images to save? Will the same trick work? At least not for me!
In this article, I want to show you how to quickly build an image crawler with **ZERO coding**. Even with absolutely no technical background, you should be able to pull this off in 30 minutes. You may need these images for reblogging, reselling, or training purposes, and the same trick extends to literally any website. Ready? Let's get started.
**Setup**
You will need the following tools:
• [**Octoparse**](https://www.octoparse.es/download): a visual, no-coding web scraping tool
• [**TabSave**](https://chrome.google.com/webstore/detail/tab-save/lkngoeaeclaebmpkgapchgjdbaekacki): a Chrome extension that saves images instantly given a list of URLs
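If you would rather skip the browser extension, a scraped list of image URLs can also be fed to a short Python script. This is a standard-library sketch; `urlretrieve` performs the actual downloads, so run it only on URLs you are allowed to fetch.

```python
import os
import urllib.request

def numbered_name(i: int, url: str) -> str:
    """Build a zero-padded local filename, keeping the URL's extension."""
    ext = os.path.splitext(url)[1] or ".jpg"  # fall back to .jpg when missing
    return f"img_{i:04d}{ext}"

def download_images(urls, dest_dir="images"):
    """Save every image URL in `urls` into dest_dir; returns the local paths."""
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for i, url in enumerate(urls):
        path = os.path.join(dest_dir, numbered_name(i, url))
        urllib.request.urlretrieve(url, path)  # fetch and write the file
        paths.append(path)
    return paths
```

With the URL column exported from the scraping task, `download_images(url_list)` does in one call what clicking "save image as" thousands of times would.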
**Prerrequisitos**
Sería mejor si está familiarizado con [**how Octoparse works**](https://www.octoparse.es/blog/mineria-de-texto-con-octoparse) en general. Echa un vistazo a [**Octoparse Scraping 101**](https://helpcenter.octoparse.es/hc/es/categories/360002855514-FAQ) si eres nuevo en la herramienta.
**Creando un proyecto**
¡No todas las imágenes son iguales! Algunas imágenes se pueden obtener directamente de la página web, otras imágenes se activan solo haciendo clic en las miniaturas. Bueno, en este tutorial, le mostraré cómo lidiar con cada uno de estos escenarios a través de algunos ejemplos.
**Ejemplo 1: Recuperar Imágenes Directamente de la página web**
Para demostrarlo, vamos a scrape las imágenes de los perros de Pixabay.com. Para seguir, busque "dogs" en Pixabay.com, entonces debería llegar a [**esta página**](https://pixabay.com/zh/images/search/dogs/).
1) Haga clic en "+ Task" para comenzar una nueva tarea en Modo Avanzado. Luego, ingrese la URL de la página web de destino en el cuadro de texto y haga clic en "Save URL".
2) A continuación, le diremos al bot qué imágenes buscar.
Haga clic en la primera imagen. El panel de Action Tips ahora lee "Image selected, 100 similar images found". Esto es genial, exactamente lo que necesitamos. Continúe para seleccionar "Select all", luego "Extract image URL in the loop".
3) Of course, we don't just want the images on page 1, but images from all pages (or as many pages as needed).
To do this, scroll down to the bottom of the current page, locate the "next page" button, and click it.
Since we obviously want to click the "next page" button many times, it makes sense to select "Loop click the selected link" in the Action Tips panel.
Now, just to confirm everything has been set up correctly, toggle the workflow switch in the upper-right corner. The finished workflow should look like this:

Also, check the data panel and make sure the desired data has been extracted correctly.
4) There is just one more thing to tweak before running the crawler.
While debugging, I noticed that the HTML source code updates dynamically as you scroll down the web page. In other words, if the page is not scrolled down, we cannot get the corresponding image URLs from the source code. Luckily for us, Octoparse handles auto-scrolling with ease.
We will need to add auto-scrolling both when the website first loads and when it paginates.
Click "Go to Web Page" in the workflow. On the right side of the workflow, locate "Advanced options" and check "Scroll down to the bottom of the page when finish loading".
Then decide how many times to scroll and at what pace. Here I set scroll times = 40, interval = 1 second, scroll method = scroll down for one screen. This basically means Octoparse will scroll down one screen 40 times, with 1 second between each scroll.
I didn't pick these settings at random; I tuned them a bit to make sure they work. I also found it essential to use "Scroll down for one screen" rather than "Scroll down to the bottom of the page", mainly because we only need the image URLs in the source code to refresh gradually.
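If you were reproducing this outside Octoparse, the "scroll one screen, 40 times, 1 second apart" setting maps to a simple loop. A hedged sketch, with `execute_js` standing in for whatever evaluates JavaScript in your browser session (for example, a Selenium driver's `execute_script`):

```python
import time

# One Octoparse-style scroll step: scroll down one screen.
SCROLL_ONE_SCREEN = "window.scrollBy(0, window.innerHeight);"

def scroll_plan(times=40, interval=1.0, step=SCROLL_ONE_SCREEN):
    """Build the (JS command, delay) sequence matching the settings above:
    scroll one screen, `times` times, `interval` seconds apart."""
    return [(step, interval) for _ in range(times)]

def run_plan(execute_js, plan, sleep=time.sleep):
    """Run each scroll command with a pause in between; execute_js is any
    callable that evaluates JavaScript in a live browser session."""
    for command, delay in plan:
        execute_js(command)
        sleep(delay)

plan = scroll_plan()
print(len(plan))  # -> 40
```

The pause between steps is what gives the page time to load the next batch of image URLs into the source code.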
Apply the same settings to the pagination step.
Click "Click to paginate" in the workflow and use exactly the same auto-scroll settings.
5) That's it. You're all set! Isn't this too good to be true? Let's run the crawler and see if it works.
Click "Start Extraction" in the upper-left corner and choose "Local extraction", which basically means the crawler will run on your own computer instead of a cloud server. \[[Download the crawler file](https://www.dropbox.com/s/sivivqqzck61326/Pixabay_example1.otd?dl=0) used in this example and [try it yourself](https://helpcenter.octoparse.com/hc/en-us/articles/360025323191-How-do-I-import-a-task-file-and-run-it-to-get-the-data-needed-)\]
**Example 2: Scrape full-size images**
Question: what if you need the full-size images?
For this example, we will use the same website, [https://pixabay.com/images/search/dogs/](https://pixabay.com/images/search/dogs/), to demonstrate how to get full-size images.
1) Start a new task by clicking "+ Task" in Advanced Mode.
2) Enter the URL of the target web page into the text box, then click "Save URL" to proceed.
3) Unlike the previous example, where we could capture the images directly, we now need to click into each individual image to view/capture the full-size version.
Click the first image; the Action Tips panel should read "Image selected, 100 similar images found".
Select "Select all".
**Then, select "Click each image".**
4) Now that we land on the page with the full-size image, things get much easier.
Click the full-size image, then select "Extract the URL of the selected image".
As always, check the data panel and make sure the desired data has been extracted correctly.
5) Follow the same steps as in Example 1 to add the pagination steps.
Click "Go to the webpage", locate the "Next page" button, and click it. Select "Loop click the selected link" in the Action Tips panel.
The finished workflow should look like this:
If it doesn't look the same, drag the steps to rearrange them.
6) Done! Try running the crawler. \[Download the [crawler file](https://www.dropbox.com/s/1uuc3kr9vqswgjc/Pixabay_example2.otd?dl=0) used in this example and [try it yourself](https://helpcenter.octoparse.com/hc/en-us/articles/360018324071-How-to-download-images-from-a-list-of-URLs-)\]
**Example 3: Get the full-size image from a thumbnail**
I'm sure you have seen something similar while shopping online, or if you run an online store yourself. For product photos, thumbnails are by far the most common way to display images. Using thumbnails substantially reduces bandwidth and loading time, which makes it much easier for people to browse through different products.
There are two ways to extract the full-size images behind the thumbnails using Octoparse.
**Option 1** \- You can set up a loop click to click each of the thumbnails, then extract the full-size image once it loads....
*Posted by melisaxinyue in u_melisaxinyue, 11/13/2020 — "How to Build an Image Scraper without Coding" (https://www.reddit.com/r/u_melisaxinyue/comments/jtemvi/cómo_construir_un_scraper_de_imágenes_sin/)*
Since the outbreak of the new airborne, contagious coronavirus, the lives of millions of people have been affected, and relevant news has exploded across every platform.
Given the situation, we thought it necessary to [collect real-time data](https://helpcenter.octoparse.es/hc/es) from official and unofficial sources, so that the public can get an unbiased understanding of this epidemic outbreak based on transparent data sources.
To get data from these sources, you can leverage web scraping tools such as Octoparse, as we have built [web scraping templates](https://www.octoparse.com/blog/big-announcement-web-scraping-template-take-away) to extract data from the Chinese government's reports. They can keep you up to date with the latest information. Now let's walk through how to use a template to extract live data.
**Step 1: Launch Octoparse on your computer and create a scraping task by clicking "**[**Task Template**](https://www.octoparse.es/tutorial-7/empieze-usar-easy-template)**".**
https://preview.redd.it/xdjkpkii9zy51.png?width=497&format=png&auto=webp&v=enabled&s=cd63a49d3f85a5f29fc921cb85a7d7acfea3b669
Note: there are a number of scraping "recipes" covering everything from e-commerce websites to social media channels. These are pre-formatted crawlers that can be used to extract data from target websites directly. You can check this article to get a better idea of [what a web scraping template is](https://www.octoparse.es/tutorial-7/wizard-mode).
**Step 2: Under the "Real-time" category, choose "National Health Commission".**
https://preview.redd.it/qgbidclj9zy51.png?width=473&format=png&auto=webp&v=enabled&s=3312683d91a438757fcb449e3b634253dbbad736
You will see two templates. One extracts [government news and announcements](http://www.nhc.gov.cn/). The other covers the [Tencent news website](https://news.qq.com/zt2020/page/feiyan.htm), which is directly connected with China's central and local Health Commissions. So far, this is the fastest way to get live data, including confirmed cases, recoveries, death toll, and mortality rate for each city in China.
https://preview.redd.it/cik0uigk9zy51.png?width=520&format=png&auto=webp&v=enabled&s=9a3d2a3c6b2da0e6862f865915889bdd9ef611fc
**Step 3: Click "2019-nCov real-time data", since we want to collect live data.**
No configuration is needed. Simply start the extraction and Octoparse will scrape the data automatically. You can export the data to many formats, such as Excel, JSON, and CSV, or to your own database via API. Here is what the data output looks like in Excel.
https://preview.redd.it/qg010u2l9zy51.png?width=505&format=png&auto=webp&v=enabled&s=541681bed7d3021f9654f9da9ebf0c23c5b0697f
**You can also extract real-time information from social media channels. There are templates covering popular platforms such as Facebook, Twitter, Instagram, and YouTube.**
For example, if you want to extract the latest tweets about the virus and see how people are reacting, you can leverage the "latest tweets" template. It is designed to collect the most recent tweets containing the search keyword you enter, and it lets you extract the web page URL, tweet URL, handles, posts, and more.
https://preview.redd.it/s8v3w2rl9zy51.png?width=512&format=png&auto=webp&v=enabled&s=673dfbc4a51fb056b390c0760c382deec8399ac8
Now let's run this template.
**Step 1: Open Twitter, type "coronavirus", and click the "Latest" tab.** [Copy the URL and paste it into the first parameter](https://twitter.com/search?q=coronavirus&src=recent_search_click&f=live).
https://preview.redd.it/9vvmwapm9zy51.png?width=487&format=png&auto=webp&v=enabled&s=f2ab9b7f996656162ad1fa4dbcf7cc3ea59f38cb
**Step 2: Enter a number for the second parameter.**
Twitter uses an [infinite scrolling](https://www.octoparse.es/tutorial-7/infinite-scrolling-and-load-more) technique, which means we have to set a "scrolling number" until we get the desired number of posts. You can set any number from 1 to 10,000. The idea is to load the page fully: for example, if you enter 10, the bot will scroll 10 times.
**Step 3: Run the scraper by clicking "Save and run", and you will get the results instantly.**
https://preview.redd.it/q9x37kan9zy51.png?width=492&format=png&auto=webp&v=enabled&s=a0f4deb24929960d82d6605d27d8e8f620dee315
In this video we covered how to use web scraping templates to collect real-time data on the coronavirus. If you also want to build your own scraper to extract articles from news portals such as the Wall Street Journal, the New York Times, and Reuters, you can watch this video.
*Posted by melisaxinyue in u_melisaxinyue, 11/13/2020 — "Web Scraping: How to Get Coronavirus (COVID-19) Data" (https://www.reddit.com/r/u_melisaxinyue/comments/jteaxz/web_scraping_cómo_obtener_coronavirus_covid19/)*
When a project wraps up, it is almost inevitable that you will need to produce a reasonably complete data analysis report.
Reports are needed in many situations. By delivery channel, they fall into several types: some are sent by email, some are explained to the project team, and some are presented and reported directly. By project type, they also vary: evaluating the launch of a new project, [A/B test](https://es.wikipedia.org/wiki/A/B_testing) results, daily data summaries, activity data analysis, and so on.
Whether it is a text document or a slide deck, the core ideas of a [data analysis](https://www.octoparse.es/blog/30-mejores-herramientas-de-big-data-para-datos-analisis) report are the same.
## Table of Contents
1. You need a "story"
2. A framework for data analysis reports
3. Conclusion
https://preview.redd.it/5tfcye5477p71.png?width=700&format=png&auto=webp&v=enabled&s=fc9da999d0ac08d14c4e57a645a5481ee0859633
## 1. You need a "story"
My own view is that product managers should keep learning in related fields, such as basic design guidelines, interaction principles, marketing, psychology, and algorithms. Beyond the obvious benefit to your work, this also helps you broaden your thinking. In fact, to write a good report, you should learn from consulting firms and investment institutions.
The core of a report is not packing in lots of content for the audience or readers to spend time digesting; the core is telling a **simple story**. Before consulting and investment firms draft a business plan, they take time to clarify the storyline. All kinds of reports should work this way: first, clarify the story you want to tell.
## 2. A framework for data analysis reports
Here is a report framework I personally like; it may need adjusting for different reporting scenarios (such as removing some steps or adding some details):
* **Project background**: briefly describe the relevant background of the project, why it is being done, and what its purpose is.
* **Project progress**: summarize the overall progress of the project and its current status.
* **Term definitions**: what are the definitions of the key indicators, and why?
* **[Data acquisition](https://es.wikipedia.org/wiki/Adquisici%C3%B3n_de_datos) method**: how is the data sampled and acquired, and what are the issues?
* **Data overview**: trends and changes in key indicators, with explanations for the important inflection points.
* **Data breakdown**: split the data across different dimensions as needed to fill in the details.
* **Summary**: condense the main conclusions of the analysis above into an overview.
* **Follow-up improvements**: analyze the remaining problems and propose solutions for improvement and prevention.
* **Acknowledgements & Appendix**: detailed data.
**Project background & Project progress**
For the project background, briefly describe the relevant context: why the project is being done and what its purpose is. For the project progress, summarize the overall progress and the current status. There is not much to say about these two points. If the audience is a project member, you can keep it simple; if the audience is someone unfamiliar with the project, write more, but still try to use the simplest possible words to explain things.
**Term definitions & Data acquisition method**
Term definitions: what are the definitions of the key indicators, and why? Many people overlook this point, yet many data misunderstandings stem from the lack of a unified definition of the indicators. For example, click-through rate could be clicks / views, or unique clickers / unique visitors; and unique users could be deduplicated per visit or per day. Without a clear explanation, different people will interpret the numbers differently, and the readability of all the data drops sharply.
Data acquisition method: how is the data sampled and acquired, and what are the issues? Raw data often has shortcomings: it must be cleaned to remove noise, and some assumptions are needed to fill in gaps. The cleaning and completion methods should be explained to, and acknowledged by, the report's audience, so they can gauge the confidence level of the results.
**Data overview & Data breakdown**
The data overview should cover trends and changes in key indicators, with explanations for the important inflection points.
The data breakdown should split the data across different dimensions as needed to fill in the details.
This is essentially the analysis method described above. If you need the audience to see a comparison or a trend, use a chart; if you need them to see specific numbers, use a table, and clearly highlight the numbers that deserve emphasis. Points to note: basic indicators should be few but critical, and breakdown indicators should be meaningful and detailed. For a slide deck, it is enough for each page to explain one conclusion or one trend clearly, and the key conclusion should be stated plainly in a single sentence.
**Summary & Follow-up improvements**
In the summary, condense the main conclusions of the analysis above into an overview.
For follow-up improvements, give directional guidance for subsequent iterations and improvement measures based on the conclusions and problems found in the analysis. This part is usually the fundamental purpose of the analysis.
**Acknowledgements & Appendix**
The acknowledgements thank the project team and the supporting departments; they, too, hope their work or active cooperation leads to meaningful data results, and it makes later collaboration more harmonious.
The appendix is an optional supplement and does not have to appear in the data report itself, but it is still valuable information. In a slide deck, this part can also go after the acknowledgements, so if a colleague has a question, they can jump to the final explanation at any time.
## 3. Conclusion
If you cannot measure a product, you cannot understand it, and naturally you cannot improve it. It all comes down to [data](https://dataservice.octoparse.com/servicio-de-datos). The meaning of a data report is similar: once a project ends, a complete report is required, so it matters greatly both as a deliverable and for the team.
*Posted by melisaxinyue in u_melisaxinyue, 9/23/2021 — "A Framework for Data Analysis Reports" (https://www.reddit.com/r/u_melisaxinyue/comments/ptpbkf/un_marco_para_informes_de_análisis_de_datos/)*
[Web scraping](http://octoparse.es/) is the process of extracting specific content from a website without accessing an API to obtain it.
**How to build a crawler:**
For programmers and developers, Python is the most common way to build a web scraper/[crawler](https://www.octoparse.es/blog/como-construir-un-crawler) to extract web content. For example, the code in the screenshot below can be used to extract data from a public website, pokemondb.net.
https://preview.redd.it/hqq6kvh8dzy51.png?width=888&format=png&auto=webp&v=enabled&s=e3d71b3f502c22dbe5d3141b979b619cd893c482
(image from gist.github.com/anchetaWern/6150297)
For most people without coding skills, it is better to use a web content extractor to get specific content from web pages. Below are some solutions using Octoparse:
**1. Extract content from dynamic web pages**
Web pages can be static or dynamic. Often, the web content you want to extract changes from moment to moment because the website uses the AJAX technique. AJAX lets a web page send and receive data in the background without interfering with the display of the page. In this case, you can check the AJAX option to let Octoparse extract content from dynamic web pages.
https://preview.redd.it/rgt28ab9dzy51.png?width=1014&format=png&auto=webp&v=enabled&s=0bacad258a3979e8a692244d65cffebdc9e7449a
Check the AJAX timeout setting in Octoparse.
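Under the hood, an AJAX timeout is just "poll until the content appears, or give up after N seconds". A minimal sketch of that pattern (the names are illustrative, not Octoparse internals):

```python
import time

def wait_for(condition, timeout=10.0, poll=0.5):
    """Poll `condition` until it returns a truthy value, or raise after
    `timeout` seconds -- the same idea as an AJAX timeout setting."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("content did not appear before the timeout")
        time.sleep(poll)

# In a real scraper, `condition` would check the DOM, e.g. whether the
# AJAX-loaded results container exists yet; here, a stand-in that is ready:
print(wait_for(lambda: "results loaded"))
```

Setting the timeout too low aborts pages that are merely slow; too high, and the crawler wastes time on pages that will never load.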
**2. Extract hidden content from a web page**
Have you ever wanted specific data from a website, only for the content to appear after clicking a link or hovering the mouse over an element? For example, certain contact information on craigslist.org appears only after clicking the Reply button.
https://preview.redd.it/zxy4wq6adzy51.png?width=619&format=png&auto=webp&v=enabled&s=fafe591d7c5a2ba10bc76482182bb4da7dc81130
In fact, such hidden content can be found in the page's HTML source code, and Octoparse can extract the text from the source. Simply use the "Click Item" or "Mouse Over" command in the "Action Tips" panel to perform the extraction.
https://preview.redd.it/2wl6cc2bdzy51.png?width=515&format=png&auto=webp&v=enabled&s=c8399079ecdee629e02adcf44648610811aed214
**3. Extract content from pages with infinite scrolling**
You may also notice that some posts only load once you scroll toward the bottom of the page, as on Twitter. This is because the site uses infinite scrolling, which usually relies on AJAX or JavaScript to issue requests when you reach the end of the page. In this case, you can set the AJAX timeout and choose the scroll method and scroll times to customize how the bot extracts the content.
https://preview.redd.it/77nz670cdzy51.png?width=1010&format=png&auto=webp&v=enabled&s=ba7512c81ad3bd76dcbfda5690464a9c98231c76
Check the "Scroll Down" option in Octoparse to extract the content.
**4. Extract hyperlinks from a web page**
A typical website contains at least one hyperlink; if you want to extract all the links on a page, you can use Octoparse to extract every URL across the site.
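The same "extract all URLs" task can be sketched with Python's standard-library HTML parser (the class name is my own, for illustration):

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href attribute of every <a> tag in an HTML document."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

parser = LinkExtractor()
parser.feed('<p><a href="https://example.com">home</a><a href="/about">about</a></p>')
print(parser.links)  # -> ['https://example.com', '/about']
```

Note that relative links such as `/about` would still need to be joined to the page's base URL (e.g. with `urllib.parse.urljoin`) before they can be crawled.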
**5. Extract text from a web page**
If you want to extract the content between HTML tags, such as a `<DIV>` or `<SPAN>` tag, Octoparse lets you extract all the text from the source code.
**6. Extract image URLs from a web page**
Octoparse cannot download the image files themselves, but it can extract their URLs.
https://preview.redd.it/7805yxeddzy51.png?width=348&format=png&auto=webp&v=enabled&s=dd0c02ac8891aaf9554c8a30675ba1ec1a8c1f50
**Conclusion**
Octoparse can extract anything displayed on a web page and export it to structured formats such as Excel, CSV, HTML, TXT, and databases. However, Octoparse currently cannot download images, videos, GIFs, or canvases. We hope these features will be added in an upgraded version in the near future. Click [HERE](https://www.octoparse.com/download/) to download Octoparse and learn more from its [rich tutorials](https://www.octoparse.com/tutorial/).
*Posted by melisaxinyue in u_melisaxinyue, 11/13/2020 — "Extract Content from a Web Page" (https://www.reddit.com/r/u_melisaxinyue/comments/jtehzx/extraer_contenido_de_una_página_web/)*
*Posted by melisaxinyue in r/webscraping, 5/21/2020 — "3 Web Scraping Applications to Make Money" (post removed; https://www.reddit.com/r/webscraping/comments/gntedx/3_web_scraping_aplicaciones_para_ganar_dinero/)*
2020 is destined to be a year of [web scraping](https://www.octoparse.es/). Companies compete with one another using massive amounts of information collected from a multitude of users, whether it is their consumption behavior or the content they share on social media. To succeed, you need to build up your data assets.
Many businesses and industries remain vulnerable when it comes to data. A [survey](http://newvantage.com/wp-content/uploads/2017/01/Big-Data-Executive-Survey-2017-Executive-Summary.pdf) conducted in 2017 indicates that **37.1%** of companies have no Big Data strategy. Among the remaining data-driven companies, only a small percentage have achieved some success, and one of the main reasons is a minimal understanding of data technology, or the lack of it. Web scraping software is therefore an essential key to establishing a data-driven business strategy. You can use Python, Selenium, or PHP to scrape websites, and it's a bonus if you are good at programming. In this article, **we discuss using** [**web scraping tools**](https://www.octoparse.es/) **to make scraping effortless.**
**I tried out several web scraping tools and listed my notes as follows.** Some tools, such as Octoparse, provide templates and scraping services, which is a great advantage for companies that lack data scraping skills or are reluctant to spend time on web scraping. Other web scraping tools require some programming skills to configure advanced scraping, for example Apify. So **it really depends on what you want to scrape and what results you want to achieve.** A web scraping tool is like a chef's knife: it is important to check its condition before setting up a fully equipped kitchen.
https://preview.redd.it/gvtymv0pczy51.png?width=700&format=png&auto=webp&v=enabled&s=0776ea0f37115cbc6b5847b53ddfa5ce912f3180
**First**, spend some time studying the target websites. This doesn't mean you have to parse the web page; just take a look at it. At minimum, you should know how many pages you need to scrape.
**Second**, pay attention to the HTML structure. Some websites are not written in a standard way. That said, if the HTML structure is messy and you still need to scrape the content, you will need to modify the XPath.
**Third**, find the right tool. Here are some personal experiences and thoughts regarding scraping tools; I hope they can give you some ideas.
### #1 [Octoparse](https://www.octoparse.es/)
[Octoparse](https://www.octoparse.es/) is a free and powerful web scraper with comprehensive features. It is very generous of them to offer unlimited free pages! Octoparse simulates the human browsing process, so the whole scraping workflow is intuitive and easy to operate, even if you have no idea about programming. You can use its Regex and XPath tools to extract data precisely. It is common to encounter websites with messy coding structures, since they are written by people and people make mistakes; in such cases, it is easy to miss irregular data during collection. XPath can solve 80% of missing-data problems, even when scraping dynamic pages; however, not everyone can write the correct XPath. In addition, Octoparse has built-in templates, including Amazon, Yelp, and TripAdvisor, for beginners to use. Scraped data can be exported to Excel, HTML, CSV, and more.
**Pros:** standard guidelines and YouTube tutorials, built-in task templates, free unlimited crawls, Regex and XPath tools. You name it, Octoparse offers more than enough amazing features.
**Cons:** unfortunately, Octoparse does not yet extract data from PDFs, nor does it download images directly (it can only extract image URLs).
Learn how to create a web scraper with Octoparse.
### #2 [Mozenda](https://www.mozenda.com/)
Mozenda is a cloud-based web scraping service. It includes a web console and an agent builder that lets you run your own agents and view and organize the results. It also lets you export or publish extracted data to a cloud storage provider such as Dropbox, Amazon S3, or Microsoft Azure. Agent Builder is a Windows application for building your own data projects. Data extraction is processed on optimized harvesting servers in Mozenda's data centers, which spares the user's local resources and protects their IP addresses from being banned.
**Pros:** Mozenda provides a comprehensive action bar that makes it very easy to capture AJAX and iframe data, and it also supports documentation extraction and image extraction. Besides multi-threaded extraction and smart data aggregation, Mozenda provides geolocation to prevent IP bans, plus a test mode and error handling to fix errors.
**Cons:** Mozenda is a bit expensive, charging from $99 for 5,000 pages. In addition, Mozenda requires a Windows PC to run and has stability issues with extra-large websites.
### #3 [80legs](https://80legs.com/)
80legs is a powerful web crawling tool that can be configured to custom requirements. It is interesting that you can customize your app to scrape and crawl, but if you are not a technical person, you should tread carefully: make sure you know what you are doing at every step of the customization. 80legs supports fetching huge amounts of data, along with the option to download the extracted data instantly, and it is great that you can crawl up to 10,000 URLs per run on the free plan.
**Pros:** 80legs makes web crawling technology accessible to companies and individuals on a limited budget.
**Cons:** if you want to get a massive amount of data, you need to set up a crawl and use the pre-built API. The support team is slow.
### #4 [Import.Io](https://www.import.io/)
Import.Io is a web scraping platform that supports most operating systems. It has a user-friendly interface that is easy to master without writing any code. You can click and extract any data that appears on the web page, and the data is stored on its cloud service for days. It is a great choice for enterprises.
**Pros:** Import.io is easy to use and supports almost every system. It is quite pleasant to work with thanks to its clean interface, simple dashboard, and screen capture.
**Cons:** the free plan is no longer available, and every sub-page costs credits, so it can become expensive if you extract data from many sub-pages. The paid plan costs $299 per month for 5,000 URL queries, or $4,999 per year for half a million.
### #5 [Content Grabber](https://sequentum.com/)
As the name indicates, Content Grabber is a powerful, multi-featured visual scraping tool for web content extraction. It can automatically collect complete content structures, such as product catalogs or search results. People with strong programming skills can work even more effectively through the Visual Studio 2013 integration in Content Grabber, which also offers many third-party tools for advanced users.
**Pros:** Content Grabber is very flexible in handling complex websites and data extraction, and gives you the freedom to adapt the scraper to your needs.
**Cons:** the software is only available on Windows and Li...
---
Post: "Mejores Datos Scraping Herramientas (10 Reseñas Principales)" — posted by melisaxinyue in u_melisaxinyue on 11/13/2020
https://www.reddit.com/r/u_melisaxinyue/comments/jtegqx/mejores_datos_scraping_herramientas_10_reseñas/
---
Content is the most basic way to attract traffic: without a certain amount of quality content, neither Google nor visitors will be interested in your website, because there is little value to gain from browsing it.
Here are 2 main no-coding solutions for extracting content from websites and building your content base: pick one, or both, and try them out.
Table of Contents
#### Extract content from websites using a web scraping tool
#### Extract content from websites using content aggregation tools
#### Conclusion
## Extract content from websites using a Web Scraping tool
Web scraping is the process of extracting information from a website without using an API to obtain the content, but you must follow the requirements of the website's robots.txt to avoid unauthorized activity.
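The robots.txt check just mentioned can be automated with Python's standard library. A minimal sketch, assuming the policy text has already been fetched (the example policy and URLs are made up for illustration):

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt: str, page_url: str, user_agent: str = "*") -> bool:
    """Check a site's robots.txt policy before scraping a page."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())  # parse the already-fetched policy text
    return rp.can_fetch(user_agent, page_url)

# Hypothetical policy: everything under /private/ is off-limits to all agents.
POLICY = """User-agent: *
Disallow: /private/
"""
```

In practice you would download `https://<site>/robots.txt` first and run this check before every request.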
Here are some of the main pros and cons of web scraping.
**Pros:**
1. You can scrape trending, well-rated content from multiple platforms with a web scraping tool. This saves you the time and money of dealing with multiple content aggregators.
2. You can collect audience-reaction data such as likes, views, and shares where available. Content and reaction data are valuable for building your content matrix.
3. You can scrape content from your competitors' sites to analyze the competition and their content strategy.
4. You can build a content base with a large pool of resources. Whenever you need inspiration or references, you have abundant resources at your fingertips.
**Cons:**
1. The extracted data may need additional processing, and you may have to edit the content format manually yourself, which can take some time.
2. The sites you scraped content from may block your IP; you could lose access to those sites if they block you.
3. The tool cannot automate the content distribution process for you the way some content aggregation tools do.
If you are looking for a good web scraping tool, there are three popular ones you should not miss.
[**Octoparse**](https://www.octoparse.es/)
[Octoparse](https://www.octoparse.es/) is a powerful web scraping tool for extracting text, video, and images from any website. It offers free pre-built templates for extracting data from a variety of websites, which means users do not have to configure a crawler themselves to extract information from sites such as Amazon, Booking, etc. They only need to choose a template and enter keywords or URLs to extract the site's most commonly scraped data fields. If users want to create a custom crawler, that is also easy to set up: simply click on the web page to build one.
In addition, it has many practical features such as data reformatting, task scheduling, parent-task configuration, cloud extraction acceleration, etc. It is one of those powerful tools that can help you extract content from websites easily.
**Scraper**
Scraper is a Chrome extension with limited data-extraction features compared with other software, but it is handy for individual users doing online research. It can export the extracted data directly to Google Spreadsheets.
This tool is also designed for web crawling beginners: you can easily copy the data to the clipboard or store it in spreadsheets using OAuth. Automatic XPath generation is one of its great features for beginners, although if you want more precise data you will inevitably have to rewrite the XPath yourself.
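Rewriting an auto-generated XPath by hand usually means replacing a brittle positional path with an attribute-based one. A small sketch of the difference, using the standard library's ElementTree (its limited XPath subset stands in for a full engine; the HTML snippet and class names are made up for illustration):

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<div class="results">
  <div class="item"><span class="title">First</span></div>
  <div class="item"><span class="title">Second</span></div>
</div>
""")

# Brittle, position-based path (what auto-generators often emit):
# breaks as soon as the page inserts an ad block before the results.
auto = doc.find("./div[2]/span")

# Attribute-based path written by hand: keyed to the markup's meaning.
manual = [s.text for s in doc.findall(".//div[@class='item']/span[@class='title']")]
```

The positional path returns only the element that happens to sit second, while the hand-written one collects every titled item regardless of position.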
[**ParseHub**](https://www.parsehub.com/)
Parsehub is a great web scraper that supports collecting data from websites built with AJAX, JavaScript, and similar technologies, so web incompatibility problems are unlikely when you use it. It also has advanced machine-learning technology that can help you transform web documents into data.
Parsehub is compatible with all popular operating systems, such as Windows, macOS, and Linux, so you do not have to worry about cross-platform use. The free version lets you set up at most five public projects, while the cheapest paid subscription plans let you create at least 20 private projects for scraping websites. It is very convenient for individual users and small businesses.
https://preview.redd.it/fh318j8aflu51.png?width=1200&format=png&auto=webp&v=enabled&s=a2fd04c659a7bf2bb4f49b772b2888a0bdcadfb6
## Extract content from websites using content aggregation tools
A content aggregation tool is an application or website that helps you collect content from a wide range of platforms and then republish all of it in one place. There are many kinds of content aggregation tools, specializing either in different content types (sports news, financial news, gaming news, etc.) or different content formats (video, blogs, podcasts, images, etc.).
There are some important advantages and disadvantages of content aggregation tools you should know before making a decision.
**Pros:**
1. Some content aggregation tools can personalize content for you. This generally helps your audience connect better with your site and lets them know your site is the right one for them.
2. Some content aggregators are masters of content distribution. They know very well how to maximize the reach of content to your potential audience, thereby helping you drive more traffic to your sites.
3. You can leave manual content distribution to a content aggregation tool, freeing yourself from tedious manual work and letting you focus on the work that matters.
4. One of the best things about using content aggregators is that they can help you build backlinks to your site and thus improve your SEO performance.
**Cons:**
1. When your audience reads content aggregated from other sites, they may subscribe to the original sites and leave yours.
2. Using content aggregators on your site can grow the popularity of the original content owners, not yours.
3. Without creating original content, you may miss the chance to understand your audiences better, and you would have no direct communication with them. That translates into lost conversion opportunities.
4. The main function of a content aggregator is to collect a large amount of content; the tool itself cannot filter the content or guarantee its reliability, so your site can be affected by fake news.
**Trapit**
Trapit is a comprehensive content aggregation tool for companies, covering a variety of content topics. It can pull text and video feeds from a wide range of websites, and it also offers built-in analytics and social scheduling tools. If you want to aggregate industry insights, research, and trends for your audience on your website or social media platforms, it is one of the great tools you should not miss.
**BuzzSumo**
BuzzSumo is a powerful online content aggregation tool that keeps you up to date on every trending topic in the industry and lets you find popular content on any website. You can search for the topic you are interested in and share it from the dashboard. In addition, the "Content Research" section lets you engage with the people who share the content.
Buzzsumo is a tool that can help you stay focused and on target.
**Elink.Io**
Elink.io is the fastest way to collect and share web content on any topic from multiple site...
---
Post: "2 Formas de Extraer Contenido de Sitios Web Sin Codificación para Aumentar el Tráfico Web" — posted by melisaxinyue in r/webscraping on 10/22/2020
https://www.reddit.com/r/webscraping/comments/jfuoqe/2_formas_de_extraer_contenido_de_sitios_web_sin/
---
Searching for, copying, and pasting lots of images from Reddit can take a long time. But have you ever thought about building a Reddit image scraper using Octoparse, the powerful web scraping tool? Let's find out how to do it.
**Table of Contents**
[What is an Image Scraper and how does it work?](https://www.octoparse.es/blog/crear-un-image-scraper-de-reddit-sin-codificacion#h1)
[How to build an Image Scraper?](https://www.octoparse.es/blog/crear-un-image-scraper-de-reddit-sin-codificacion#h2)
[Tutorial example: Build a Reddit Image Scraper with Octoparse](https://www.octoparse.es/blog/crear-un-image-scraper-de-reddit-sin-codificacion#h3)
This article focuses on how to build a Reddit image scraper, but let's start with the idea of an image scraper itself: an image scraper removes the manual work of copying and pasting images from web pages.
For more details, click here: [https://www.octoparse.es/blog/crear-un-image-scraper-de-reddit-sin-codificacion](https://www.octoparse.es/blog/crear-un-image-scraper-de-reddit-sin-codificacion)
https://preview.redd.it/bbo2tm3910671.png?width=1600&format=png&auto=webp&v=enabled&s=f52c0db38ca737dbf8e6bc911870fb74d856aa02
---
Post: "Crear un Image Scraper de Reddit sin codificación" — posted by melisaxinyue in u_melisaxinyue on 6/18/2021
https://www.reddit.com/r/u_melisaxinyue/comments/o2lqrr/crear_un_image_scraper_de_reddit_sin_codificación/
---
Technological advances have taken the world by storm: everything that was once part of our imagination is now a reality. The Internet is equipped with everything one could need, from a flood of information and data to videos and images. However, the amount of data available online is enormous, so extracting and downloading that data can be a tedious process. Companies need data in the form of information, numbers, images, etc. almost daily.
https://preview.redd.it/06mfsjxodd871.png?width=2000&format=png&auto=webp&v=enabled&s=6eef1fcdb9dcd3a740348c749db37148314c2b51
Visuals, in the form of images, have gained popularity in this technology-driven world; they tend to instantly elevate the overall look and aesthetics of anything. We are fully aware that the many available [data extraction tools](https://www.octoparse.es/) make the job much easier, cheaper, and faster for businesses large and small. The question at hand, however, is whether there is a tool, software, or method that can also make the tedious process of [downloading images from a list of URLs](https://www.octoparse.es/faq/extract-and-download-images-from-websites-using-octoparse) more accessible, cheaper, and faster. Well, let us take this opportunity to tell you: there is certainly a way to easily download a large number of images from a list of URLs. Yes, you read that right! The process is more or less similar to the [data extraction](https://www.octoparse.es/DataExtraction) method, with slight changes here and there. So let's dive in and find out how to do this; keep reading.
**Table of Contents**
[What do you need to download images from a list of URLs?](https://www.octoparse.es/blog/como-descargar-im%C3%A1genes-de-la-lista-de-url#h1)
[How to use Octoparse to extract the URLs of selected images?](https://www.octoparse.es/blog/como-descargar-im%C3%A1genes-de-la-lista-de-url#h2)
## What do you need to download images from a list of URLs?
To run the process of downloading images from URLs, you need two things. First, a [web scraping tool](https://www.octoparse.es/); we suggest our favorite, [Octoparse](https://www.octoparse.es/), since it is a visual, no-coding web scraping tool. Second, [TabSave](https://chrome.google.com/webstore/detail/tab-save/lkngoeaeclaebmpkgapchgjdbaekacki?hl=en), a Chrome add-on that saves the images for you as soon as you provide the list of URLs.
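If you prefer scripting over a browser add-on, the role TabSave plays here (saving every image in a URL list to disk) can be sketched in a few lines of standard-library Python. File names are derived naively from the URL path, and the folder name is just an example:

```python
import os
from urllib.parse import urlsplit
from urllib.request import urlretrieve

def filename_for(url: str, index: int) -> str:
    """Derive a local file name from an image URL's path component."""
    name = os.path.basename(urlsplit(url).path)
    return name or f"image_{index}.jpg"  # fall back when the path has no file name

def download_all(urls, dest="images"):
    """Fetch every image in the list into a local folder."""
    os.makedirs(dest, exist_ok=True)
    for i, url in enumerate(urls):
        urlretrieve(url, os.path.join(dest, filename_for(url, i)))
```

Feed `download_all` the URL list exported from the scraping step and it fills the `images/` folder, one file per URL.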
It is worth remembering that not all images are created equal: some can be grabbed directly from the web page, while others can only be downloaded by clicking their respective thumbnails.
## How to use Octoparse to extract the URLs of selected images?
First, let's see how to get an image directly from a web page. Suppose, for example, you want to extract sunset images from [Pexel.com](http://pexel.com/). You would go to the website and type "sunset" into the pexels.com search bar, which opens a page showing many sunset images. You would then:
1. Click "+ Task" to start a new task in [Advanced Mode](https://helpcenter.octoparse.com/hc/en-us/articles/360018281431-Advanced-Mode?__hstc=97730752.26fa06f9315bc9580ac0f31c1ad9d064.1624329979919.1625015647175.1625033247463.28&__hssc=97730752.211.1625033247463&__hsfp=2231130639).
2. Paste the URL of the selected web page into the text box.
3. Click "Save URL".
The first part of the process is done, and you will now land on another page. We need to tell the bot which images to look for. So:
1. Click the first image. The "Action Tips" panel should now read "Image selected, 100 similar images found" — this means we are on the right track.
2. Go to Select and choose "Select all".
3. Next, choose "Extract the image URL in the loop".
Since we want images from several pages rather than just one particular page, scroll down to the bottom of the current page and click "next page". Extracting images from many pages would naturally require clicking "next page" many times, but instead we can select "[Loop click the selected link](https://www.octoparse.es/faq/pagination-scraping-loop-click-item)" from "Action Tips".
Before running your web scraper/crawler, make sure of one last thing: if the HTML source code only updates as you scroll, and the web page is not scrolled all the way down, the corresponding image URLs will not be captured. This is one of the main reasons we lean toward Octoparse, since it auto-scrolls quickly. Make sure to add auto-scroll both when the website is first opened and again on each pagination step. To do this you need to:
1. Select "Go to web page" in the workflow. There are "Advanced Options" on the right side of the workflow.
2. Check "Scroll down to the bottom of the page when finished loading".
You can even customize how many times to scroll and at what pace: Octoparse lets you scroll down a single screen up to 40 times, with one second between scrolls. Check which settings fit your needs best; you may have to tweak them accordingly. Once you are happy with the settings, apply them to the pagination step as well: click "[Click to paginate](https://www.octoparse.es/tutorial-7/extract-multiple-pages-through-pagination)" in the workflow and then use the same auto-scroll settings.
And that's it! Now all you need to do is run the crawler and verify that it works correctly. To do so, simply click "Start Extraction" in the top-left corner of the screen and select "[Local extraction](https://www.octoparse.es/tutorial-7/local-extraction#:~:text=of%20Local%20Extraction-,Run%20tasks%20on%20Local%20Extraction,to%20run%20the%20task%20locally.)", which means the crawler runs on your own system rather than on the cloud server. That's it!
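The core of what the point-and-click workflow above builds — visit a results page and pull out every image URL — looks roughly like this when scripted by hand with the standard library. The fetching and next-page handling are omitted because they are site-specific:

```python
from html.parser import HTMLParser

class ImageSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag on a page."""
    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.srcs.append(src)

def image_urls(page_html: str):
    """Return every image URL found in one page of HTML."""
    parser = ImageSrcCollector()
    parser.feed(page_html)
    return parser.srcs
```

A full scraper would call `image_urls` on each fetched page and then follow the "next page" link in a loop, which is exactly what the Octoparse pagination step automates.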
Now, the method for scraping a full-size image is slightly different. We will use the same example of downloading sunset images from pexels.com to show how to download images at full size.
1. Start a new task by clicking "+ Task" under "Advanced Mode".
2. Paste the URL of the selected web page into the text box, then click "Save URL" to continue.
3. Click each image individually to get it at full size.
4. After clicking the first image, the Action Tips panel should read "Image selected, 100 similar images found"; click "Select all".
5. Now select "Loop click each image"; this takes you to the page that holds the full-size image.
Simply click the full-size image and select "Extract the URL of the selected image", then click "Go to web page", choose the "Next page" button, and select "Loop click the selected link" under "Action Tips".
Guess what? You're all set! Test the crawler and check that it works perfectly.
---
Post: "Cómo descargar imágenes de la lista de URL" — posted by melisaxinyue in u_melisaxinyue on 6/30/2021
https://www.reddit.com/r/u_melisaxinyue/comments/oath9e/cómo_descargar_imágenes_de_la_lista_de_url/
---
[**What is web scraping?**](https://www.octoparse.com/blog/make-web-scraping-easy)
[Web scraping](https://www.octoparse.es/), also known as [web harvesting](https://www.octoparse.es/blog/30-mejores-software-gratuitos-de-web-scraping) and [web data extraction](http://www.dataextraction.io/), basically refers to collecting data from websites via the Hypertext Transfer Protocol (HTTP) or through web browsers.
Table of contents
* [What is web scraping?](https://www.octoparse.es/blog/como-comenzo-y-sucedera-en-futuro#div1)
* [How does web scraping work?](https://www.octoparse.es/blog/como-comenzo-y-sucedera-en-futuro#div2)
* [How did it all start?](https://www.octoparse.es/blog/como-comenzo-y-sucedera-en-futuro#div3)
* [How is web scraping done?](https://www.octoparse.es/blog/como-comenzo-y-sucedera-en-futuro#div4)
* [What will web scraping be like?](https://www.octoparse.es/blog/como-comenzo-y-sucedera-en-futuro#div5)
**How does web scraping work?**
In general, web scraping involves three steps:
* First, we send a GET request to the server and receive a response in the form of web content.
* Next, we parse the website's HTML code following its tree structure.
* Finally, we use a Python library to search the parse tree.
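The three steps above can be sketched end to end in a few lines. This version uses only the standard library (in place of a third-party library such as Beautiful Soup) and extracts just the page title; the URL argument is whatever page you want to scrape:

```python
from html.parser import HTMLParser
from urllib.request import urlopen

class TitleFinder(HTMLParser):
    """Step 3: walk the parse events and pick out the <title> text."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        self.in_title = (tag == "title")

    def handle_data(self, data):
        if self.in_title:
            self.title += data

def scrape_title(url: str) -> str:
    html = urlopen(url).read().decode("utf-8", "replace")  # step 1: GET request
    finder = TitleFinder()
    finder.feed(html)   # step 2: parse the HTML tree
    return finder.title # step 3: search the parsed result
```

A real scraper would replace `TitleFinder` with logic for whatever fields it needs, but the GET–parse–search shape stays the same.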
https://preview.redd.it/mo1ex5nfku351.png?width=666&format=png&auto=webp&v=enabled&s=525de1519fadf712978a68336cc5dc7ab8331b2a
**How did it all start?**
Although to many people it sounds like a technique as fresh as concepts like "Big Data" or "machine learning", the history of web scraping is actually much longer. It dates back to the time when the World Wide Web, or colloquially "the Internet", was born.
In the beginning, the Internet was not even searchable. Before search engines were developed, the Internet was just a collection of File Transfer Protocol (FTP) sites that users navigated to find specific shared files. To find and organize the distributed data available on the Internet, people created a specific kind of automated program, known today as the **web crawler/bot**, to **fetch every page** on the Internet and then **copy all the content** into databases for indexing.
The Internet then grew into the home of millions of web pages containing vast amounts of data in many forms, including text, images, video, and audio. It became an open data source.
As this data source became incredibly rich and easy to search, people began to find that the information they needed could be located easily. That information was usually scattered across many websites, but the problem was that not every website offered the option to download its data, and copying and pasting by hand was clumsy and inefficient.
That is where web scraping came in. Web scraping is actually powered by web bots/crawlers, which function the same way as those used in search engines — that is, **fetch and copy**. The only difference may be the scale: web scraping focuses on extracting only specific data from certain websites, whereas search engines often fetch most of the websites on the Internet.
### How is web scraping done?
* 1989 The birth of the World Wide Web
Technically, the World Wide Web is different from the Internet: the former refers to the information **space**, while the latter is the **network** made up of computers.
Thanks to Tim Berners-Lee, the inventor of the WWW, we have the following three things that have become part of our daily life:
* Uniform Resource Locators (URLs), which we use to go to the website we want;
* embedded hyperlinks, which let us navigate between web pages, such as product detail pages where we can find product specifications and many other things like "customers who bought this also bought";
* web pages that contain not only text but also images, audio, video, and software components.
* 1990 The first web browser
Also invented by Tim Berners-Lee, it was called WorldWideWeb (no spaces), named after the WWW project. One year after the web appeared, people had a way to see it and interact with it.
* 1991 The first http:// web server and the first web page
The web kept growing at a fairly moderate speed. By 1994, the number of HTTP servers had passed 200.
* June 1993 The first web robot — World Wide Web Wanderer
Although it worked the same way web robots do today, it was only intended to measure the size of the web.
* December 1993 The first crawler-based web search engine — JumpStation
Since there were not that many websites on the web, search engines at the time used to rely on human website administrators to collect and edit links into a particular format.
JumpStation brought a new leap: it was the first WWW search engine powered by a web robot.
Since then, people started using these programmatic web crawlers to harvest and organize the Internet. From Infoseek, Altavista, and Excite to Bing and Google today, the core of a search-engine robot remains the same: find a web page, fetch and copy its content, and follow its links to new pages.
Because web pages are designed for human users rather than for automated use, even with the development of web bots it was still hard for computer engineers and scientists to do web scraping, let alone ordinary people. So people set out to make web scraping more accessible.
* 2000 Web APIs and the API crawler
API stands for **Application Programming Interface**. It is an interface that makes developing a program much easier by providing the building blocks.
In 2000, Salesforce and eBay launched their own APIs, with which programmers could access and download some of the publicly available data.
* 2004 Beautiful Soup
With simple commands, Beautiful Soup makes sense of a site's structure and helps parse content out of the HTML container. It is considered the most sophisticated and advanced library for web scraping, and also one of the most common and popular approaches today.
* 2005-2006 Visual web scraping software
In 2006, Stefan Andresen and his Kapow Software (acquired by Kofax in 2013) launched Web Integration Platform version 6.0, something now understood as **visual web scraping** software, which lets users simply highlight the content of a web page and structure that data into a usable Excel file or database.
Finally, there was a way for the masses of non-programmers to do web scraping on their own.
Since then, web scraping has been hitting the mainstream. Non-programmers can now easily find more than [80 ready-to-use data extraction programs](https://www.capterra.com/data-extraction-software/) that provide visual processes.
**What will web scraping be like?**
Businesses' growing demand for web data across every industry keeps the web scraping market thriving, and that brings new jobs and business opportunities.
This is an easier era than any we have had in history. Any person, company, or organization can get the data they want, as long as it is available on the web. Thanks to web crawlers/bots, APIs, standard libraries, and various ready-to-use software packages, once someone has the will to get data, there is a way. Or they can turn to accessible, affordable professionals.
**As long as there is an Internet, there will be web scraping.**
One way to avoid the potential legal consequences of web scraping is to consult professional providers of d...
---
Post: "Web Scraping: Cómo Comenzó y Qué Sucederá en El Futuro" — posted by melisaxinyue in r/webscraping on 6/9/2020
https://www.reddit.com/r/webscraping/comments/gzj916/web_scraping_cómo_comenzó_y_qué_sucederá_en_el/
---
https://www.octoparse.es/2021-black-friday-sale?
---
Post: "Black Friday Sale de Octoparse" — posted by melisaxinyue in u_melisaxinyue on 11/11/2021
https://www.reddit.com/r/u_melisaxinyue/comments/qrfz99/black_friday_sale_de_octoparse/
---
https://i.redd.it/g97rjk110mq71.jpg
---
Post: link to https://octoparse.es/blog/las-9-mejores-herramientas-de-visualizacion-de-datos-para-no-desarrolladores — posted by melisaxinyue in r/visualization on 9/30/2021
https://www.reddit.com/r/visualization/comments/pyg7l9/httpsoctoparseesbloglas9mejoresherramientasdevisua/
---
Posted 8/30/2021
**Web scraping tools** (also known as web data extraction or web crawling tools) are widely applied in many fields today. Before scraping tools reached the public, scraping was a magic word reserved for people with programming skills: its high threshold kept ordinary people locked out of Big Data. **A web scraping tool is automated capture technology, and it closes the gap between Big Data and every person.**
I have listed the **20 BEST web scrapers**, including their features and target audiences, for your reference. Welcome to make the most of it!
**Table of Contents**
**What are the benefits of using web scraping techniques?**
**The 20 BEST web scrapers**
* [**Octoparse**](http://octoparse.es/)
* [**Cyotek WebCopy**](https://www.cyotek.com/cyotek-webcopy)
* [**HTTrack**](https://www.httrack.com/)
* [**Getleft**](https://sourceforge.net/projects/getleftdown/)
* [**Scraper**](https://chrome.google.com/webstore/detail/scraper/mbigbapnjcgaffohmbkdlecaccepngjd)
* [**OutWit Hub**](https://addons.mozilla.org/en-US/firefox/addon/outwit-hub/)
* [**ParseHub**](https://www.parsehub.com/)
* [**Visual Scraper**](http://visualscraper.blogspot.hk/)
* [**Scrapinghub**](https://scrapinghub.com/)
* [**Dexi.io**](https://dexi.io/)
* [**Webhose.io**](https://webhose.io/)
* [**Import.io**](https://www.import.io/)
* [**80legs**](http://80legs.com/)
* [**Spinn3r**](https://www.spinn3r.com/)
* [**Content Grabber**](https://contentgrabber.com/)
* [**Helium Scraper**](http://www.heliumscraper.com/en/index.php?p=home)
* [**UiPath**](http://www.uipath.com/)
* [**Scrape.it**](https://www.npmjs.com/package/scrape-it)
* [**WebHarvy**](https://www.webharvy.com/)
* [**ProWebScraper**](https://prowebscraper.com/)
**Conclusion**
**What are the benefits of using web scraping techniques?**
* Free your hands from repetitive copy-and-paste work.
* Put the extracted data into a well-structured format, including but not limited to Excel, HTML, and CSV.
* Save the time and money of hiring a professional data analyst.
* It is the cure for marketers, sellers, journalists, YouTubers, researchers, and many others who lack technical skills.
**1.** **Octoparse**
**Octoparse** is a web scraper for extracting almost every kind of data you need from websites. You can use Octoparse to rip data from the web with its extensive features and capabilities. It has two operation modes, [**Task Template Mode**](https://helpcenter.octoparse.es/hc/es/articles/360039675314-Empieza-a-usar-Easy-Template-una-soluci%C3%B3n-de-web-scraping-para-principiantes) and **Advanced Mode**, so non-programmers can pick it up quickly. The friendly point-and-click interface guides you through the entire extraction process. As a result, you can easily extract website content and save it in structured formats such as EXCEL, TXT, HTML, or your databases in a short period of time.
In addition, it provides **Scheduled Cloud Extraction**, which lets you extract dynamic data in real time and keep a tracking record of website updates.
You can also extract complex websites with difficult structures by using its built-in Regex and XPath configuration to locate elements precisely. And you no longer have to worry about IP blocking: Octoparse offers IP proxy servers that rotate IPs automatically, staying undetected by aggressive websites.
Octoparse debería poder satisfacer las necesidades de rastreo de los usuarios, tanto básicas como avanzadas, sin ninguna habilidad de codificación.
**2.** **Cyotek WebCopy**

WebCopy is a free web crawler that lets you copy partial or complete websites locally onto your hard disk for offline reference.

You can change its settings to tell the bot how you want it to crawl. Beyond that, you can also **configure domain aliases, user-agent strings, default documents**, and more.

However, WebCopy does not include a virtual DOM or any form of JavaScript parsing. If a website relies heavily on JavaScript to operate, WebCopy most likely cannot make a true copy, and it will probably fail to handle dynamic website layouts correctly.
**3.** **HTTrack**

As a free website-crawling program, HTTrack **provides functions well suited to downloading an entire website to your PC**. It has versions available for Windows, Linux, Sun Solaris, and other Unix systems, covering most users. Interestingly, HTTrack can mirror one site, or several sites together (with shared links). You can decide how many connections to open simultaneously while downloading web pages under "set options", and you can retrieve the photos, files, and HTML code of the mirrored website and resume interrupted downloads.

In addition, proxy support is available within **HTTrack to maximize speed.**

HTTrack works as a command-line program, for either private (capture) or professional (online web mirror) use. That said, HTTrack is best suited to people with **advanced programming skills**.
**4**. **Getleft**

Getleft is a free, easy-to-use web spider. It lets you **download an entire website** or any individual web page. After launching Getleft, you can enter a URL and choose the files you want to download before it starts. As it proceeds, it rewrites all links for local browsing. It also offers multilingual support; Getleft now supports 14 languages! However, it only provides limited FTP support: it will download files, but not recursively.

Overall, Getleft should satisfy users' **basic scraping needs without requiring more sophisticated skills**.
**5**. **Scraper**

Scraper is a Chrome extension with limited data-extraction features, but it is useful for online research. It also lets you **export the data to Google Spreadsheets**. You can easily copy data to the clipboard or store it in spreadsheets using OAuth. Scraper can generate XPaths automatically to define the URLs to scrape.

It does not offer an all-inclusive scraping service, but it can satisfy most people's data-extraction needs.
**6**. **OutWit Hub**

OutWit Hub is a Firefox add-on with dozens of data-extraction features to simplify your web searches. This web scraping tool can browse through pages and store the extracted information in a suitable format.

OutWit Hub offers **a single interface for scraping tiny or huge amounts of data per need**, and it lets you scrape any web page from the browser itself. You can even create automatic agents to extract data.

It is one of the simplest web scraping tools around: it is free to use and offers the convenience of extracting web data without writing any code.
**7.** **ParseHub**

ParseHub is an excellent web scraper that supports collecting data from websites that use **AJAX, JavaScript, cookies**, and so on. Its machine-learning technology can read, analyze, and then transform web documents into relevant data.

ParseHub's desktop application supports systems such as Windows, Mac OS X, and Linux. You can even use the web app built into the browser.

As a free program, ParseHub limits you to five public projects. Paid subscription plans let you create at least 20 private projects for scraping websites.

ParseHub is aimed at practically anyone who wants to play with data. It could be anyone, from analy...
***

*Post: "Las 20 Mejores Herramientas de Web Scraping para 2021" (u/melisaxinyue, r/u_melisaxinyue, 8/30/2021) https://www.reddit.com/r/u_melisaxinyue/comments/pedxns/las_20_mejores_herramientas_de_web_scraping_para/*

***

*Posted 11/13/2020 10:25:17 AM:*
The digital world we live in constantly creates an ever-growing stream of data. Making use of dynamic big data has become the key to data analysis for businesses.

In this article, we will answer the following questions:

**- Why is it important to capture dynamic data?**

**- How does dynamic data effectively promote business growth?**

**- And most importantly, how can we easily access dynamic data?**
https://preview.redd.it/7dm5016shzy51.png?width=829&format=png&auto=webp&v=enabled&s=70a12fecf953eba70213a4742d1ef1dced235a9e
**Why is capturing dynamic data so important?**

Generally speaking, you can see better and act faster by constantly monitoring dynamic data streams. More specifically, capturing dynamic data can help you:

**Make data-driven decisions faster**

Capturing dynamic data equips you with real-time information about new trends in the market and among your competitors. With all the up-to-date information at hand, you can greatly reduce the time lag between product updates. In other words, you can obtain data-driven insights and make data-driven decisions more quickly and easily.

As Jeff Bezos, CEO of Amazon, put it in a letter to shareholders, "Speed matters in business." "High-velocity decision making" is of great importance to business development.

**Build a more powerful database**

As the volume of data keeps growing in today's digital world, the value attached to each individual piece of data has dropped dramatically. To improve the quality of data analytics and the validity of decisions, businesses need to build a comprehensive, high-volume database by continuously extracting dynamic data.

Data is a time-sensitive asset: the older the information, the harder it is to collect. As the amount of information doubles every year in both size and speed, it becomes unprecedentedly crucial to keep track of changing data for further analysis.

Broadly speaking, short-term data collection helps spot recent problems and make small decisions, while long-term data collection can help you identify trends and patterns in order to set long-term business goals.

**Establish an adaptive analytics system**

The ultimate goal of data analytics is to build an adaptive, autonomous data analytics system that analyzes problems continuously. There is no doubt that an adaptive analytics system rests on the automated collection of dynamic data. In this case, you save the time of rebuilding analytical models over and over, and you remove human factors from the loop. Self-driving cars are a great example of an adaptive analytics solution.
**How does dynamic data effectively promote business growth?**

We can apply dynamic data analysis in many areas to facilitate business development, including:

**Product monitoring**

Product information such as prices, descriptions, customer reviews, and images is available on online marketplaces and updated periodically. For example, pre-launch market research can be [easily conducted by retrieving product information from Amazon](https://www.octoparse.es/tutorial-7/scrape-product-information-from-amazon) or [scraping prices from eBay](https://www.octoparse.es/tutorial-7/scrape-pricing-from-ebay).

Extracting dynamic information also lets you evaluate products' competitive positioning and develop effective pricing and stocking strategies. It can be a reliable and effective way to monitor competitors' actions in the market.

**Customer experience management**

Companies are more attentive than ever to customer experience management. By Gartner's definition, it is "the practice of designing and reacting to customer interactions to meet or exceed customer expectations and, thus, increase customer satisfaction, loyalty and advocacy."

For example, [extracting all of a product's reviews on Amazon](https://www.octoparse.es/tutorial/amazon-scraping-case-study-scrape-amazon-product-reviews-and-ratings/) can help decipher how customers feel about the product by analyzing positive and negative feedback. This is useful for understanding customers' needs, as well as knowing a customer's satisfaction level in real time.

**Marketing strategies**

Dynamic data analysis tells you which strategy worked best in the past, whether the current marketing strategy is working well, and what improvements can be made. Extracting dynamic data lets you evaluate the success of a marketing strategy in real time and make precise adjustments accordingly.
**How can we easily access dynamic data?**

To collect dynamic data in a timely and continuous fashion, traditional manual copy-and-paste is no longer practical. In this case an easy-to-use web scraping tool can be the optimal solution, with the following advantages:

**No coding required**

With a web scraping tool, you do not need any prior programming knowledge. Scraping dynamic data from the web becomes easy for any person and any business to achieve.

**Works for all kinds of websites**

Different websites have different structures, so even an experienced programmer needs to study a website's structure before writing scripts. But a powerful web scraping tool can extract data from different websites quickly and easily, saving you tons of time studying each one.

**Scheduled data extraction**

This requires the web scraping tool to support cloud operation, rather than running only on a local machine. That way, the scraper can run automatically and extract data on your preferred schedule.
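Scheduled extraction boils down to a polling loop that fires a job at a fixed interval. Here is a minimal local sketch of that idea; `collect_data` is a hypothetical stand-in for a real scraping job, and a cloud platform runs the equivalent server-side:

```python
import time
from datetime import datetime

def collect_data() -> str:
    # Hypothetical placeholder for a real scraping job (HTTP fetch, parse, store).
    return f"snapshot at {datetime.now():%H:%M:%S}"

def run_on_schedule(job, interval_seconds: float, max_runs: int) -> list:
    """Run `job` every `interval_seconds`, `max_runs` times, collecting results."""
    results = []
    for i in range(max_runs):
        results.append(job())
        if i < max_runs - 1:           # no need to sleep after the last run
            time.sleep(interval_seconds)
    return results

logs = run_on_schedule(collect_data, interval_seconds=0.5, max_runs=3)
print(len(logs))
```

A production scheduler would add error handling, persistence, and drift correction, which is exactly the kind of operational burden a cloud service takes off your hands.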
**Octoparse Cloud Extraction can do even more than that.**

**Flexible scheduling**

Octoparse Cloud Extraction supports scraping web data at any time and at any frequency, based on your needs.
https://preview.redd.it/ggpx9p4vhzy51.png?width=515&format=png&auto=webp&v=enabled&s=803c394a5f748a88031abe656d8050a0cbe8f8ae
**Faster data collection**

With 6 to 20 cloud servers working simultaneously, the same data set can be collected up to 6 to 20 times faster than when running on a local machine.
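The speedup comes from plain parallelism: several workers fetch pages at the same time instead of one after another. A minimal sketch using Python's thread pool; the `fetch` function here is a simulated stand-in for a real HTTP request, and the URLs are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url: str) -> str:
    # Stand-in for a real HTTP GET so the sketch runs offline.
    return f"<html>data from {url}</html>"

urls = [f"https://example.com/page/{i}" for i in range(6)]

# Six workers fetch the six pages concurrently; map() keeps input order.
with ThreadPoolExecutor(max_workers=6) as pool:
    pages = list(pool.map(fetch, urls))

print(len(pages))
```

With real network calls, the wall-clock time approaches that of the slowest single request rather than the sum of all of them, which is where the "6-20x" figure comes from.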
**Cost efficiency**

Octoparse Cloud Extraction supports running the scraper and storing dynamic data in the cloud, with no worries about expensive hardware maintenance or network interruptions.

In addition, at a cost 50% lower than comparable services on the market, Octoparse is dedicated to improving the value of data analytics by giving everyone affordable access to big data.
https://preview.redd.it/gx9swf3whzy51.png?width=585&format=png&auto=webp&v=enabled&s=255cf60e2c89b53c17fa261466ea64d4e575d531
**API: customize your data feeds**

Although cloud data can be exported to our database automatically, with the API you can greatly improve the flexibility of the data a...
***

*Post: "Extracción de Datos Dinámicos en Tiempo Real" (u/melisaxinyue, r/u_melisaxinyue, 11/13/2020) https://www.reddit.com/r/u_melisaxinyue/comments/jter9r/extracción_de_datos_dinámicos_en_tiempo_real/*

***

*Posted 8/2/2021 6:23:46 AM:*
Since launching an online business has minimal or no startup cost, aspiring entrepreneurs will likely face several rivals who may try to undercut their prices. It is therefore important to monitor your competitors to determine which products they offer and at what price.

[Monitoring competitors' product listings](https://www.octoparse.es/help) will give you a wealth of valuable information about them; perhaps a well-funded rival is running penetration pricing in one of your niches, or maybe it is testing a new competitive pricing model. Whatever your business, knowing what your competitors are doing is one of the first steps toward success.

**Table of contents**

* [Identifying optimal pricing strategies using competitors' listings](https://www.octoparse.es/blog/seguimiento-de-la-competidores-para-la-estrategia-de-precios-y-la-planificacion-de-productos#Identifying%20optimal%20pricing%20strategies%20using%20competitor%E2%80%99s%20listings)
* [Comparing product assortment against competitors' listings](https://www.octoparse.es/blog/seguimiento-de-la-competidores-para-la-estrategia-de-precios-y-la-planificacion-de-productos#%3EComparing%20product%20assortment%20against%20competitors%E2%80%99%20listings)
[Tracking Competitors for Pricing Strategy and Product Planning](https://preview.redd.it/rhocnlee1we71.png?width=1600&format=png&auto=webp&v=enabled&s=7a9c1f60e1511760ae82161d56f9084a39eaa670)

Identifying [optimal pricing strategies](https://www.octoparse.es/help) using competitors' listings

Price is one of the most important factors in the purchase decision. Every customer takes price into account when buying a product, and customers are more likely to research a product as its price increases. That is why it is important to know how competitors price a product similar to yours on a different channel, and then maintain a competitive price. This does not always mean lowering your prices; in fact, you may discover you have been charging less for a product than you should. To start building a pricing strategy using competitor listing data, begin by answering the following questions:
### What is my pricing model?

Start by identifying the [general pricing model](https://www.volusion.com/blog/how-to-price-ecommerce-products-to-compete-online/) that best fits your products. If you sell replacement toner cartridges to small businesses, you are probably working on an economy model, trying to undercut competitors' prices. If you sell luxury handbags to celebrities, lowering prices could hurt sales. Whether you work on a cost-based or a value-based pricing model, it is essential to know how much competitors' products or services sell for.

### How do my customers perceive my products' value?

Perhaps you previously lowered a product's price but ultimately saw no increase in sales. That is likely because your customers saw your product differently from what the lower price conveyed. Cheaper prices do not always drive purchase decisions; instead, consumers are more willing to pay a price they consider "reasonable" for the product they are buying. For this reason, retailers should set prices based on product perception, to offer the best perceived product value. Comparing prices with your competitors is one way to understand how consumers perceive similar products.
## Comparing product assortment against competitors' listings

With competitor monitoring, your main focus is benchmarking against your competitors to uncover gaps in your assortment. [Monitoring the products your competitors](https://dataservice.octoparse.com/comercio-electronico-y-venta-minorista?__hstc=97730752.cfa4be011358393efe2a4d1b0e579f03.1626416084584.1627872776490.1627884756794.36&__hssc=97730752.4.1627884756794&__hsfp=1029763304) have recently added to their stores can give you valuable insight into market trends. If you run an online store selling sporting goods and notice your competitors adding more dumbbells from various brands, you can assume they see the market value of dumbbells rising. This suggests you should expand your assortment if you do not carry dumbbells in your store.

### Who are my true competitors?

It is essential to understand your competitors' product assortment and be prepared to adapt to the market. However, it is also important to recognize who your true competitors are. Very often, when we think of competitors, we think of companies in the same industry; while this seems correct in theory, it is not always true in practice. Benchmark against the wrong rivals, and your pricing and assortment strategy will fall short.

To find the right target, you need to calculate the price index. A concept borrowed from economics, the price index is used to measure the rate of inflation. In e-commerce, we can use it to examine how much impact competitors will have on your business.

To measure the price index for a given product (for example, the dumbbells mentioned above), divide the cost of the competitor's dumbbell by the cost of the dumbbell in your store and multiply by 100.
Read about: [¿Por Qué Necesitas Un Raspador De Comercio Electrónico Para El seguimiento De la Competencia?](https://www.octoparse.es/help)

[Web Scraping in the Big Data Solution](https://octoparsewebscraping.medium.com/web-scraping-in-the-big-data-solution-7d2804d41477)

**Price Index = (Competitor's Product Cost / Your Product Cost) x 100**

A single product's price index will not tell us much on its own; we need to sum the price index of every product and divide by the number of products to obtain the average price index. Repeating the calculation above gives us each competitor's price index per product. From there, we can plot all the data points on a chart and examine the deviations to determine which competitor has the greatest impact on us.
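The per-product and average price-index calculation can be sketched in a few lines of Python; all product names and prices below are hypothetical examples:

```python
# Price index per product: (competitor price / our price) * 100.
# Product names and prices are hypothetical illustrations.
our_prices = {"dumbbell": 25.0, "yoga mat": 18.0, "kettlebell": 40.0}
competitor_prices = {"dumbbell": 22.5, "yoga mat": 19.8, "kettlebell": 38.0}

def price_index(competitor: float, ours: float) -> float:
    return competitor / ours * 100

indices = {
    name: price_index(competitor_prices[name], our_prices[name])
    for name in our_prices
}

# Average index across the assortment shows this competitor's overall
# positioning against us: below 100 means they are cheaper on average.
average_index = sum(indices.values()) / len(indices)
print(round(average_index, 1))
```

Repeating this per competitor gives one average index each, and plotting those values is the "deviation" chart described above.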
[Price Index](https://preview.redd.it/0t1stz8k1we71.jpg?width=666&format=pjpg&auto=webp&v=enabled&s=f2a22d6fdd1a453bc06e416e5fc4aadbc09c58cf)
**How do I track competitors?**

Many data solution providers charge a lot of money just for competitor monitoring. However, despite the high cost, you still have to deal with the underlying problem of security. Web scraping tools like [Octoparse](https://www.octoparse.es/) serve as an alternative for prudent investors who are conservative on security while careful on spending. Octoparse provides businesses of any size with the ability to stay informed automatically, allowing retailers to keep an eye on all categories of each competitor across different web sources at a much lower cost.

The saying "Keep your friends close and your enemies ...
***

*Post: "Seguimiento de Competidores para la Estrategia de Precios y la Planificación de Productos para 2021" (u/melisaxinyue, r/u_melisaxinyue, 8/2/2021) https://www.reddit.com/r/u_melisaxinyue/comments/ow8xgq/seguimiento_de_competidores_para_la_estrategia_de/*

***

*Posted 8/23/2021 8:08:29 AM:*
"You will realize how powerful regular expressions are once you use them." (a developer's heartfelt sigh)

[https://fireship.io/lessons/regex-cheat-sheet-js/](https://preview.redd.it/u2tn7l97f2j71.png?width=1920&format=png&auto=webp&v=enabled&s=6c2c9117265100eb8a1dcdcce7b92e342e806b1e)
## What is a regular expression (RegEx)?

"A regular expression (sometimes called a rational expression) is a sequence of characters that define a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. 'find and replace'-like operations."

The concept arose in the 1950s, when the American mathematician Stephen Kleene formalized the description of a regular language. It came into common use with Unix text-processing utilities: ed (a line editor for the Unix operating system) and grep (a command-line utility for searching plain-text data sets for lines that match a regular expression, acting as a filter that processes one stream and produces another). This excerpt from [Wikipedia](https://es.wikipedia.org/wiki/Regular_expression) is commonly used to define regular expressions.
[**Regular expression syntax**](https://docs.python.org/3/library/re.html)

Regular expressions can be concatenated to form new regular expressions; if A and B are both regular expressions, then AB is also a regular expression. In general, if a string p matches A and another string q matches B, the string pq will match AB. This holds unless A or B contain low-precedence operations, boundary conditions between A and B, or numbered group references. Thus, complex expressions can easily be built from the simpler primitive expressions described here.

Regular expressions can contain both special and ordinary characters. Most ordinary characters, like 'A', 'a', or '0', are the simplest regular expressions; they simply match themselves. You can concatenate ordinary characters, so `last` matches the string 'last'. (In the rest of this section, we write REs in this special style, usually without quotes, and the strings to be matched 'in single quotes'.)
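A quick illustration of the concatenation rule using Python's `re` module: if A matches "ab" and B matches "12", then AB matches "ab12".

```python
import re

# A matches lowercase runs, B matches digit runs.
A = r"[a-z]+"
B = r"[0-9]+"
AB = A + B  # concatenation of two regular expressions is a regular expression

match = re.fullmatch(AB, "ab12")
print(match is not None)
```

The reverse string "12ab" does not match AB, because concatenation is order-sensitive: the A part must match first.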
## What can you do with RegEx?

Regular expressions can be used to match HTML tags and extract data from HTML documents.

### Here are some RegEx use cases:

#### [Using RegEx to extract emails](http://www.octoparse.es/blog/extraer-email-de-cadenas-o-archivos-txt)

#### [Using RegEx to extract phone numbers](https://octoparse.es/blog/regex-how-to-extract-all-phone-numbers-from-strings)

#### [RegEx to reformat extracted data](http://www.octoparse.es/tutorial-7/re-format-data-extracted)
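As a sketch of the first use case, here is a simplified email pattern in Python. Real-world address syntax is more permissive than this expression, so treat it as an illustration rather than a validator:

```python
import re

# A common simplified email pattern: local part, "@", domain, dot, TLD.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

text = "Contact sales@example.com or support@help.example.org for details."
emails = EMAIL_RE.findall(text)
print(emails)
```

`findall` returns every non-overlapping match, which is exactly what you want when pulling addresses out of scraped page text.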
HTML is essentially made up of strings, and what makes regular expressions so powerful is that a single regular expression can match many different strings.

Admittedly, a regular expression is not the first choice for parsing HTML properly, because HTML commonly contains errors such as missing closing tags and mismatched tags that trip up pattern matching. Programmers are also more likely to reach for perfectly good HTML parsers such as PHPQuery, BeautifulSoup, or html5lib-Python. But if you want to match HTML tags quickly and know a little regular-expression syntax (easy to learn, hard to master), this incredibly convenient tool can help you identify patterns in HTML documents.

Every programmer, and anyone who wants to extract web data, is strongly encouraged to learn regular expressions, because this tool improves the efficiency and productivity of your work.
Let's look at some **examples**:

* Regular expressions to match HTML tags:

`<(.*)>.*?|<(.*) />`

`<(\S*?)[^>]*>.*?</\1>|<.*?/>`

* Regular expression to match all TD tags:

`<td\s*.*>\s*.*<\/td>`

* Regular expression to match `<img src="test.gif" />`:

`<[a-zA-Z]+(\s+[a-zA-Z]+\s*=\s*("([^"]*)"|'([^']*)'))*\s*/>`

We can match a variety of HTML tags with regular expressions and thereby easily extract data from HTML documents.
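For instance, here is a sketch of the TD-tag idea in Python, using a slightly adapted non-greedy variant of the pattern above that also captures the cell contents. This works for small, well-formed snippets; a real parser such as BeautifulSoup is safer for arbitrary HTML:

```python
import re

# Well-formed sample row; real scraped HTML is rarely this tidy.
html = "<tr><td>Octoparse</td><td>Free</td><td>Windows</td></tr>"

# Non-greedy capture of each <td> cell's contents.
cells = re.findall(r"<td\s*[^>]*>(.*?)</td>", html)
print(cells)
```

The non-greedy `(.*?)` stops at the first `</td>`, so each cell is captured separately instead of the whole row being swallowed by one greedy match.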
https://preview.redd.it/mvchv7sff2j71.png?width=600&format=png&auto=webp&v=enabled&s=c4aef5a3cfd3e1303d4d7f5c3ddbc400033ac179
([Download Octoparse](https://octoparse.es/download/windows), open the software, and click the toolbox icon in the lower-left corner)

## [Free RegEx Tool - Octoparse](https://octoparse.es/)

[Octoparse](https://octoparse.es/), a visual web data collection tool, provides a regular-expression generator. You can easily generate simple regular expressions to meet your various needs for extracting content from HTML documents. Octoparse also fully supports validating custom regular expressions.
***

*Post: "Uso de expresiones regulares para coincidir con HTML" (u/melisaxinyue, r/u_melisaxinyue, 8/23/2021) https://www.reddit.com/r/u_melisaxinyue/comments/p9va3w/uso_de_expresiones_regulares_para_coincidir_con/*

***

*Posted 11/13/2020 9:46:03 AM:*
https://preview.redd.it/wbwllnlmazy51.png?width=768&format=png&auto=webp&v=enabled&s=a9bb98bd0cdbc94ef4a5f8890128b9d82df101a9
Scraping dynamic betting odds from online bookmakers is an important statistical resource for sports analytics, such as predicting the winner or valuing a team, or simply for placing a low-risk bet.

In this article, I would like to address the following three questions:

* Why should we scrape betting odds?
* How can we scrape betting odds more easily and quickly?
* How can we automatically keep the betting odds in our database up to date?
**Why should we scrape betting odds?**

Professional betting agencies make their fortune by calculating their betting odds to maximize profit and avoid large payouts. They build statistical models on large data sets, then compute average odds and make predictions once they have identified the outliers.

On the one hand, the movement of betting odds reflects where people are placing their bets: the more bets there are, the lower the odds. On the other hand, betting companies use hedging to offset their own losses and gains.

Is it possible to come up with a method to beat the betting agencies? First, we need to discover the correlation between the agencies' odds and the actual results. We can scrape the odds reported by betting agencies along with the actual results of each competition. From there, we can compare the two and build a prediction model.
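As background for working with the scraped numbers: decimal odds imply a probability (p = 1 / odds), and summing the implied probabilities across all outcomes of one event exposes the bookmaker's built-in margin (the "overround"). A minimal sketch with hypothetical odds:

```python
# Hypothetical decimal odds for one match.
odds = {"home": 2.50, "draw": 3.20, "away": 3.00}

# Implied probability of each outcome: p = 1 / decimal odds.
implied = {outcome: 1 / price for outcome, price in odds.items()}

# The sum exceeds 1.0 by the bookmaker's margin (overround).
overround = sum(implied.values())
print(round(overround, 3))
```

Comparing implied probabilities against the actual result frequencies you scrape is one simple starting point for the prediction model described above.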
**How do we scrape betting odds?**

In this article, I will show you how to scrape betting odds from an odds-comparison site. You can also [download the scraping task](https://www.dropbox.com/s/8soip1z2oo962f7/WorldCupOdds2018Worl_Copy.otd?dl=0) to run it on your end.

To follow along, you need an Octoparse account and the free [application downloaded](https://www.octoparse.es/Download) to your computer.
**Step 1: Create the task and open the web page**

**1.1.** We will create the task in Advanced Mode. Enter the betting website's URL, then click "Guardar URL" (Save URL) at the bottom of the interface.

https://preview.redd.it/eywc37gnazy51.png?width=560&format=png&auto=webp&v=enabled&s=083e051fc3bc5ab54d88fab3b9f18b0f51dc3681

**1.2.** Toggle the "Workflow" button. This lets us check our workflow conveniently.
https://preview.redd.it/tcidyh6oazy51.png?width=839&format=png&auto=webp&v=enabled&s=a40cef0379aef4acba14edbbda7a74f6b1ab2ba2
**Step 2: Select and extract the data**

**2.1.** In the built-in browser, click a region/country name, then click the expand button at the bottom of the "Action Tips" panel. Octoparse will expand the selection from "Table Cell" (TD) to "Table Row" (TR).
https://preview.redd.it/bbh3nsooazy51.png?width=886&format=png&auto=webp&v=enabled&s=4fce65448e4a008e3841928d6759f37b519287c1
**2.2.** Click the "Select all sub-elements" command in the "Action Tips" panel, so that Octoparse selects all the data in the same row.
https://preview.redd.it/7iqpazdpazy51.png?width=883&format=png&auto=webp&v=enabled&s=aa375dd86890544b2443fa97a4bc0f75faa9fb0b
**2.3.** Click the "Select all" command in the "Action Tips" panel, so that Octoparse selects the data in every row of the table. Last but not least, click the "Extract data" command.

https://preview.redd.it/y1693b1qazy51.png?width=868&format=png&auto=webp&v=enabled&s=9208e7d0b420acc4596fda9df1a2c92bb179ffe6

Octoparse will now show the extracted information in the data fields.
&#x200B;
https://preview.redd.it/nzwwi90razy51.png?width=619&format=png&auto=webp&v=enabled&s=6c602b8502739e3c3679cae0452943b954cf4bef
**Paso 3: Filtrar los Datos Extraídos**
Si la información extraída en el campo de datos es la que esperaba, puede omitir este paso. Sin embargo, si no es lo que desea, puede volver a seleccionar los datos, repita el paso anterior hasta obtener el correcto. De lo contrario, asegúrese de que XPath sea correcto. ([Para obtener más información sobre XPath, haga clic aquí](https://www.octoparse.es/tutorial-7/herramienta-octoparse-xpath))
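Under the hood, the TD-to-TR expansion from Step 2 and the XPath check here are the same idea: selecting whole table rows instead of single cells. A minimal sketch of that selection on a made-up table, using Python's standard library rather than Octoparse itself:

```python
# Row-vs-cell selection in XPath terms: //tr picks table rows (TR),
# ./td picks the cells (TD) inside one row. The table below is invented
# sample data, just to make the selection visible.
import xml.etree.ElementTree as ET

table = ET.fromstring("""
<table>
  <tr><td>England</td><td>5.50</td></tr>
  <tr><td>Spain</td><td>6.00</td></tr>
</table>
""")

rows = table.findall(".//tr")                          # every table row (TR)
first_row = [td.text for td in rows[0].findall("td")]  # that row's cells (TD)
print(first_row)  # ['England', '5.50']
```

Octoparse writes this kind of XPath for you, but seeing it spelled out makes it easier to fix when the auto-generated expression selects the wrong nodes.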
**3.2.** Edit the field names and customize the data fields as needed, then click "OK" to save all the settings.
https://preview.redd.it/l4ng4q2sazy51.png?width=636&format=png&auto=webp&v=enabled&s=dbf150512884d85c6f4843b2116a7be6f5ebaa2e
Tip: you can add the current extraction time by clicking "Add predefined fields" at the bottom of the data field panel.
**Step 4: Run the Scraper and Get the Data**
The overall workflow is now complete. Simply click "Save" and then "Start extraction", and we get the betting odds.
https://preview.redd.it/gpyi6rpsazy51.png?width=405&format=png&auto=webp&v=enabled&s=73d362f2aa2ae6757abe529b9dd7ee8783368d80
When the extraction finishes, we can export the data to Excel, CSV, JSON, HTML, or a database for further analysis.
https://preview.redd.it/h6svznctazy51.png?width=913&format=png&auto=webp&v=enabled&s=6883e1632ba433c7d19602c19ea958ebe0d801df
**How can we keep the betting odds in our database automatically up to date?**
**Solution A:** [**Standard Plan**](https://www.octoparse.es/Pricing)
**First**, [schedule the task with "Cloud Extraction"](https://www.octoparse.es/tutorial-7/cloud-extraction/) at the required frequency, for example every 5 minutes; the task will then run automatically at 5-minute intervals. This is vital for keeping the data fresh so that you never miss an odds movement.
https://preview.redd.it/8by9090uazy51.png?width=456&format=png&auto=webp&v=enabled&s=91c32fd38aa976c49b3daefdb0fba2a0a81e553d
**Second,** connect to the [Octoparse API](http://dataapi.octoparse.com/help). This way, the extracted data can be delivered to the database automatically, without opening the Octoparse app.
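As a rough sketch of what connecting to the API looks like from a script: the `/token` route and password-grant body below are assumptions based on the API's OAuth-style sign-in, so check the help page linked above for the authoritative routes before relying on this.

```python
# Hedged sketch: build (but do not send) the sign-in request that the
# Octoparse Data API is documented to answer with an access token.
# Endpoint path and body fields are assumptions; verify against
# http://dataapi.octoparse.com/help.
import urllib.request

BASE = "http://dataapi.octoparse.com"

def build_token_request(username: str, password: str) -> urllib.request.Request:
    """Assemble the password-grant token request without sending it."""
    body = f"username={username}&password={password}&grant_type=password"
    return urllib.request.Request(
        f"{BASE}/token",
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

req = build_token_request("me@example.com", "secret")
print(req.full_url)      # http://dataapi.octoparse.com/token
print(req.get_method())  # POST
```

Sending it is `urllib.request.urlopen(req)`; the returned token would then accompany the data-export calls that push rows toward your database.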
**Solution B**: [**Professional Plan**](https://www.octoparse.es/Pricing)
Connecting to the [Octoparse Advanced API](http://advancedapi.octoparse.com/help) lets you control the task (run or stop it) and pull the data into your own system.
Beyond this, you get more crawlers (up to 250) and 20 concurrent cloud extraction tasks. That means you can import dynamic data (betting odds or team information) into your database from up to 20 sources/websites.
**Conclusion**
The value of a scraping tool is that it lets us extract web data at scale from different websites at the same time. With the same method, we could pull information from other websites to enrich our database, expand the metrics, and run a more complete analysis to predict the winner.
Don't forget to [share with us](https://twitter.com/Octoparsehola) your own analysis of the most likely winner of the 2018 FIFA World Cup. If you have trouble building the scraper, just [send us a message](https://www.octoparse.es/contact).
Related Resources
[Extracting dynamic data with Octoparse ](https://www.octoparse.es/blog/extracting-dynamic-data-with-octoparse/)
[Schedule/run the task in the cloud ](https://www.octoparse.es/tutorial-7/cloud-extraction/)
[Create your first scraper...
***
*Post: "Scraping y Crawling Análisis de Cuotas de Apuestas Deportivas" (Scraping and Crawling: Sports Betting Odds Analysis), by u/melisaxinyue, 11/13/2020. https://www.reddit.com/r/u_melisaxinyue/comments/jteczh/scraping_y_crawling_análisis_de_cuotas_de/*
***
Octoparse makes it easy to extract data from websites and automate workflows on the web. And now Zapier lets you integrate the data extracted with Octoparse into more than 2,000 apps. Once you master [the Octoparse integration for Zapier](https://zapier.com/apps/octoparse/integrations), you can connect Octoparse to apps including Google Drive, Google Sheets, Dropbox, Trello, Slack, and many more in seconds, with NO CODE.
https://preview.redd.it/b3k6lgvzm6s71.png?width=1203&format=png&auto=webp&v=enabled&s=a065be5e854cd78d3e282074dbdbba44f5c554ff
## What Is Zapier
[**Zapier**](https://zapier.com/) **is an online platform that lets you automate workflows by connecting the apps and services you use.**
Many software and service providers expose easy API access through Zapier. This lets you set up an automation workflow between them instead of writing code or wiring up an API to build the integration yourself.
See all the apps on Zapier: [https://zapier.com/apps](https://zapier.com/apps)
For example, Zapier can automatically send a data file once a data report is ready. The file's destination can be any of the apps hosted on Zapier, and you can manage many automated tasks for different platforms or destinations from a single place.
## How Zapier Works
**Zapier connects web apps by easily moving data from one to another. Whenever an event occurs in app A, it triggers a specific action in app B.**
It does this by letting you create "Zaps", also known as Zapier integrations: automated workflows consisting of one trigger and one or more actions. When you set up and turn on a new Zap, it runs its actions every time the trigger event occurs.
Back to the example: say you set up a Zap between an Octoparse task and Dropbox. A change in the task's status (the trigger) leads to the newly added dataset being delivered to Dropbox. Here, the task's completion status is the trigger event, and uploading the file to Dropbox is the automated action that follows.
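Zapier itself is no-code, but the trigger-then-action model it implements can be sketched in a few lines. Everything below (the function names, the fake task events) is invented for illustration; it only shows the shape of a Zap, not Zapier's actual machinery.

```python
# Minimal sketch of the trigger -> action model: a "zap" is a trigger
# predicate wired to an action, and it fires only when the trigger matches.

def make_zap(trigger, action):
    """Return a callable that runs `action` whenever `trigger` fires."""
    def run(event):
        if trigger(event):
            return action(event)
        return None  # trigger didn't fire: do nothing
    return run

def task_finished(event):
    """Trigger: an Octoparse task reports status 'completed'."""
    return event.get("status") == "completed"

def deliver_to_dropbox(event):
    """Action: pretend to deliver the newly extracted rows to Dropbox."""
    return f"uploaded {len(event['rows'])} rows to Dropbox"

zap = make_zap(task_finished, deliver_to_dropbox)
print(zap({"status": "completed", "rows": [1, 2, 3]}))  # fires
print(zap({"status": "running", "rows": []}))           # does nothing
```

In Zapier you assemble exactly this pair (trigger app + action app) in the web UI instead of in code.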
## Getting Started with Zapier
* **A Zapier account**
To use Zapier, you need to [create a Zapier account](https://zapier.com/sign-up/). You can sign up for a free account, which lets you use Zapier for 100 tasks per month. Note that a task is different from a Zap: a Zap will likely be used more than once and will generate many tasks.
Every time this Zap fires, it performs one task. That means a Zap you use once a day accounts for about 30 tasks per month. You can also choose one of the paid plans, which gives you more tasks per month, but you can always start with a free account to find out what Zapier can do for you.
* **An Octoparse account**
On the Octoparse side, you need a paid Octoparse account and a runnable task with existing data (run the task at least once within the last half hour).
To be more specific, you'll need a Standard, Professional, or Enterprise plan to use Zapier.
[Click here to upgrade your Octoparse plan if needed.](https://www.octoparse.es/login)
* **An account for the target app**
You'll need a working account for the target app. If your destination is a storage service such as Dropbox, you first need a Dropbox account.
## How Can You Use Zapier with Octoparse?
* Automatically deliver Octoparse data files to online file-management software with Zapier (e.g., Octoparse to Google Drive, Airtable, Tableau, Dropbox)
* Automatically deliver Octoparse data rows to an online database with Zapier (e.g., Octoparse to Google Sheets)
* Automatically send task notifications from Octoparse to your team's platform (e.g., Octoparse to Slack, Lark, Skype)
...and there are more scenarios you can explore with Octoparse and Zapier. [Let us know](https://helpcenter.octoparse.es/hc/es/requests/new) if you find anything interesting.
## The Top 6 Zaps for Octoparse
To give you an idea of the options, here is a real-time overview of the most popular Zaps for Octoparse right now. Of course, this is not everything you can do, since Zapier lets you connect to many other platforms, but it can give you a sense of the things you can do with Zapier and Octoparse:
https://preview.redd.it/hznds27dn6s71.jpg?width=806&format=pjpg&auto=webp&v=enabled&s=7e83487483038650a5a464d04ace364c3e4b46e6
## Troubleshooting
Sometimes, when setting up your tasks with Zapier, the data is not retrieved. Make sure you have run the task at least once within the last half hour.
## Resources
[Official Zapier website](https://zapier.com/)
[Zapier pricing](https://zapier.com/pricing)
[Top File Management & Storage Apps & Software | Zapier](https://zapier.com/apps/categories/files)
[Top Database Apps & Software | Zapier](https://zapier.com/apps/categories/databases)
[Top Analytics Apps & Software | Zapier](https://zapier.com/apps/categories/analytics)
[Top Team Chat Apps & Software | Zapier](https://zapier.com/apps/categories/team-chat)
## What Could Zapier Do for You?
To sum up: Zapier is an automation tool that lets you automate certain parts of your workflow. Which parts, and what that automation looks like, is entirely up to you. It saves you time so you can focus on the work that needs your attention most.
## Any Questions?
Get in touch with our dedicated support team for help and advice at [support@octoparse.com](mailto:support@octoparse.com).
***
*Post: "Cómo Conectar Octoparse con Zapier" (How to Connect Octoparse with Zapier), by u/melisaxinyue, 10/8/2021. https://www.reddit.com/r/u_melisaxinyue/comments/q3t6jk/cómo_conectar_octoparse_con_zapier/*
***
You probably know how to use basic functions in Excel. It's easy to sort, filter, chart, and outline data with Excel, and you can even perform advanced data analysis using pivot and regression models. It all becomes easy once live data is converted into a structured format.
The problem is: how do we extract data and get it into Excel? Doing it by hand, with repeated typing, searching, copying, and pasting, is tedious. Instead, you can extract web data into Excel automatically.
In this article, I'll introduce several ways to save time and energy scraping web data into Excel.
https://preview.redd.it/591u4v0n78n71.png?width=1600&format=png&auto=webp&v=enabled&s=f9883033df24f7e90f2bf5d098682026f60b8cce
Disclaimer: there are many other ways to [scrape data from a website](https://www.octoparse.es/) using programming languages such as PHP, Python, Perl, Ruby, etc. Here we only discuss how to get data from a website into Excel for non-coders.
**Table of contents**
Get web data using Excel Web Queries
Get web data using Excel VBA
Use automated web scraping tools
Outsource your web scraping project
## Get Web Data Using Excel Web Queries
Apart from manually copying and pasting data from a web page, Excel Web Queries can quickly pull data from standard web pages into Excel spreadsheets. They automatically detect tables embedded in the page's HTML. Excel Web Queries can also be used in situations where a standard ODBC (Open Database Connectivity) connection is hard to create or maintain. You can scrape a table directly from any website with Excel Web Queries.
The process comes down to a few simple steps (see [this article](https://www.excel-university.com/pull-external-data-into-excel)):
1. Go to Data > Get External Data > From Web
2. A browser window named "New Web Query" will appear
3. Type the web address into the address bar
https://preview.redd.it/6fsxh8k088n71.jpg?width=645&format=pjpg&auto=webp&v=enabled&s=5a7251cf94bd1dd7053fb8c06d7448247b5ba594
(image from excel-university.com)
4. The page will load and show yellow icons next to the data/tables on the page
5. Select the appropriate one
6. Press the Import button
You have now scraped web data into an Excel spreadsheet, neatly arranged into rows and columns as you wish.
https://preview.redd.it/9aoog0k188n71.jpg?width=845&format=pjpg&auto=webp&v=enabled&s=94388699f3a1ef5de7208ecea3442ebd0e725b08
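For readers comfortable with a little Python, `pandas.read_html` does the same table detection programmatically: it finds every embedded `<table>` and returns a DataFrame for each. The table below is made-up sample data; `read_html` needs an HTML parser such as lxml or beautifulsoup4 installed.

```python
# Detect HTML tables the way Web Queries does, then save to Excel.
from io import StringIO
import pandas as pd

html = StringIO("""
<table>
  <tr><th>Country</th><th>Population</th></tr>
  <tr><td>Spain</td><td>47000000</td></tr>
  <tr><td>Peru</td><td>34000000</td></tr>
</table>
""")

tables = pd.read_html(html)   # one DataFrame per <table> found
df = tables[0]
print(df.shape)               # (2, 2)
# df.to_excel("data.xlsx", index=False)  # writing .xlsx requires openpyxl
```

Pointing `read_html` at a URL instead of a string fetches the page and extracts its tables in one call.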
## Get Web Data Using Excel VBA
Most of us use formulas in Excel a lot (e.g. =AVG(...), =SUM(...), =IF(...)), but we are less familiar with the built-in language: Visual Basic for Applications, a.k.a. VBA. It is commonly known through "macros", and Excel files containing them are saved with a \*.xlsm extension.
Before using it:
**First**, enable the Developer tab on the ribbon (right-click the ribbon -> Customize the Ribbon -> check the Developer tab).
**Then** set up your layout. In this developer interface, you can write VBA code attached to various events. Click [HERE](https://msdn.microsoft.com/en-us/library/office/ee814737(v=office.14).aspx) to get started with VBA in Excel 2010.
https://preview.redd.it/l3x2i7n288n71.jpg?width=700&format=pjpg&auto=webp&v=enabled&s=b566736f44ab4d065ca42618eb2c1f16e283edf4
Using Excel VBA is going to be a bit technical and not very friendly for the non-programmers among us. VBA works by running macros, step-by-step procedures written in Visual Basic. To scrape website data into Excel with VBA, we need to build or obtain a VBA script that sends requests to the web pages and gets the data back from them. It's common to use VBA with XMLHTTP and regular expressions to parse the pages. On Windows, you can also use VBA with WinHTTP or InternetExplorer to scrape website data into Excel.
With a little patience and practice, it's worth learning some Excel VBA and some HTML to make your web scraping into Excel much easier and more efficient by automating the repetitive work. There is plenty of material and many forums where you can learn to write VBA code.
## Use Automated Web Scraping Tools
For anyone looking for a quick tool to scrape page data into Excel without setting up VBA code themselves, I strongly recommend automated web scraping tools such as [Octoparse](https://www.octoparse.es/), which can scrape data into your Excel spreadsheet directly or via an API.
There is no need to learn to program. You can pick one of the free web scraping programs from the [list](https://www.octoparse.es/blog/las-20-mejores-herramientas-de-web-scraping), start extracting website data right away, and export it to Excel. Different web scraping tools have their pros and cons, and you can choose the one that fits your needs.
Check out [this post](https://www.octoparse.es/blog/30-mejores-herramientas-de-big-data-para-datos-analisis) and try these TOP 30 free web scraping tools.
## Outsource Your Web Scraping Project
If time is your most valuable asset and you want to focus on your core business, the best option is to outsource this complicated web scraping work to a competent web scraping team with experience and expertise.
Scraping data from websites is hard, since anti-scraping measures restrict the practice of web scraping. A competent web scraping team can help you get data from websites properly and deliver structured data in an Excel sheet, or in any format you need.
[Octoparse](https://www.octoparse.es/) provides everything you need for automatic data extraction. You can quickly scrape web data without coding and turn web pages into structured data with a few clicks, or simply sit back and leave the work to us: we offer a data service, and our data team will meet with you to discuss your web crawling and data processing requirements.
***
*Post: "4 Formas de Extraer Datos del Sitio Web a Excel" (4 Ways to Extract Website Data into Excel), by u/melisaxinyue, 9/13/2021. https://www.reddit.com/r/u_melisaxinyue/comments/pnb05i/4_formas_de_extraer_datos_del_sitio_web_a_excel/*
***
A recruiter needs a good supply of qualified recruitment leads. Knowing how to get abundant high-quality leads helps you build a talent pool from which you can pick the right person for your company whenever needed.
[Web scraping](http://www.octoparse.es/) **is a useful technique that can help with this problem and bring benefits beyond hiring:**
## [Part 1: Choose the most suitable platforms](http://www.octoparse.es/blog/c%C3%B3mo-conseguir-pistas-de-reclutamiento-de-alta-calidad-con-web-scraping#h1)
## [Part 2: Collect target candidate information with web scraping](http://www.octoparse.es/blog/c%C3%B3mo-conseguir-pistas-de-reclutamiento-de-alta-calidad-con-web-scraping#h2)
## [Part 3: Monitor competitors' hiring information with web scraping](http://www.octoparse.es/blog/c%C3%B3mo-conseguir-pistas-de-reclutamiento-de-alta-calidad-con-web-scraping#h2)
**Part 1: Choose the Most Suitable Platforms**
Traditionally, a recruiter talked to a large number of candidates or reviewed many CVs every day. If you are reaching the wrong pool of candidates, the road to finding the right employee will be long and exhausting.
Since people split into groups and communities on the internet, recruiting can be easier if you pick the right group. LinkedIn, Facebook, and job boards like Indeed can be effective platforms for getting qualified recruitment leads.
# Part 2: Collect Target Candidate Information with [Web Scraping](http://www.octoparse.es/)
Let's take Indeed as an example. Targeting this platform, you start the talent search, through which you can identify a candidate's character and assess their ability.
Enter keywords directly into the search bar to filter for the people you're looking for. The point of web scraping is to help you extract the information from the result list into [EXCEL/CSV or other structured formats](http://www.octoparse.es/tutorial-7/export-extracted-data) available for download to your local files.
Traditionally, we could copy and paste to get the results, but that would take a long time. Having the data prepared in a structured form gives you easy access to the information and streamlines the follow-up process. You can use Octoparse to build crawlers for this purpose, or turn directly to the [Octoparse data service](http://octoparse.es/). (If you're worried about the legal issues, see [Ten Myths About Web Scraping](https://www.octoparse.com/blog/10-myths-about-web-scraping).)
Beyond that, recruiters should also make full use of online data to [optimize their recruitment strategy](http://www.octoparse.es/blog/5-essential-data-mining-skills-for-recruiters).
# Part 3: Monitor Competitors' Hiring Information with Web Scraping
Besides scraping candidate information, web scraping can benefit you in another way: monitoring your competitors' hiring information, which prepares you for a competitive or industry analysis.
Why should we monitor our competitors' hiring information?
**Figure out the true competitors**
You can get a list of industry competitors by searching for an occupation on a hiring platform such as glassdoor.com. Naturally, you'll narrow the results to companies that offer products or services similar to yours, or that serve the same audience.
You can scrape the highlighted fields and extract them into Excel. The data will be well structured (sample data extracted from Indeed is shown below). You can then pick a target company in Excel by filtering, to take a closer look at your competitor. (To learn how to extract data from glassdoor.com, see [Extract job data from Glassdoor](http://www.octoparse.es/tutorial-7/scrape-job-data-from-glassdoor).)
**Analyze the data to understand the job market and hiring competition**
By tracking your competitors' hiring information with web scraping, you can sometimes get a bigger picture of job-market trends in a given industry. And if your company is going through a turnover problem, this hiring data can give you insight into what's happening inside and outside your company.
In conclusion, to streamline the whole recruiting process and do a better job in HR, we need a database of market vacancies and open candidates, so we understand the situation and can prepare the talent pool.
Web scraping is a powerful way to really get to know big data. You can start by using a web scraping tool like Octoparse to get closer to big data and extract value from it.
[Octoparse](http://www.octoparse.es/) **YouTube Channel**
***
*Post: "Tips para Reclutamiento: Cómo Conseguir Pistas de Reclutamiento de Alta Calidad con Web Scraping" (Recruiting Tips: How to Get High-Quality Recruitment Leads with Web Scraping), by u/melisaxinyue, 10/30/2020. https://www.reddit.com/r/u_melisaxinyue/comments/jkq7l2/tips_para_reclutamiento_cómo_conseguir_pistas_de/*
***
In [data analysis](https://es.wikipedia.org/wiki/An%C3%A1lisis_de_datos) work, there is one step that can never be skipped. It plays a vital role in the whole analysis yet is often overlooked: [**Data Cleaning**](https://es.wikipedia.org/wiki/Limpieza_de_datos). When it comes to data cleaning, many people have a series of questions in mind: What is data cleaning? What exactly does data cleaning need to wash away? What are the steps of data cleaning? Let's explore them one by one.
## What Is Data Cleaning?
Data cleaning refers to removing duplication and more: excess data is filtered out and deleted, missing data is filled in, erroneous data is corrected or removed, and finally everything is organized into data we can process and use later.
## What Exactly Should Be Removed in Data Cleaning?
By definition, data cleaning is for cleaning dirty data. So what counts as [dirty data](https://www.tableau.com/es-es/learn/whitepapers/costs-of-dirty-data)? In data analysis we often need to extract data from a database, but because the database is usually a collection of data on a given subject, drawn from multiple business systems, it inevitably contains incomplete data and incorrect, highly repetitive data. This is what we call dirty data.
What is the importance of data cleaning? It aims to improve data quality and reduce the error rate in data statistics. Before analysis, we need to clean the data with the help of a computer, which mainly includes cleaning the valid range of the data, checking its logical consistency, and spot-checking data quality.
## Data Cleaning Steps
Let's look at the main path of data cleaning, as shown in the figure:
https://preview.redd.it/a1d7tjkj5xv71.jpg?width=700&format=pjpg&auto=webp&v=enabled&s=077185c549e4abb781958559e3ed623a8c4cd48a
### 1. Clean the missing values
Missing values are the most common data problem, and there are many ways to handle them. We need to follow the steps: first determine the range of missing values by computing the proportion of missing values for each field, then set strategies based on that proportion and the importance of the field.
### 2. Remove unnecessary fields
Removing unnecessary fields is very simple: they can be deleted directly. But remember that every cleaning step should be backed up, or tested successfully on small-scale data first, before processing the full dataset. If you delete the wrong data, you will regret it.
### 3. Fill in the missing content
There are three ways to fill in some of the missing values: fill them in based on business knowledge or experience, fill them in with computed results from the same indicator, or fill them in with computed results from a related indicator.
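Steps 1 to 3 map directly onto a few lines of pandas. The DataFrame below is invented sample data; the thresholds (drop a field above 90% missing, fill `price` with its own mean) are illustrative choices, not fixed rules.

```python
# The missing-value steps sketched with pandas: measure the missing
# ratio per field, drop unusable fields, fill the rest from the same indicator.
import pandas as pd

df = pd.DataFrame({
    "price":  [10.0, None, 12.0, 11.0],
    "rating": [4.5, 4.0, None, None],
    "notes":  [None, None, None, None],   # almost entirely missing
})

ratio = df.isna().mean()                          # step 1: proportion missing per field
df = df.drop(columns=ratio[ratio > 0.9].index)    # step 2: drop hopeless fields
df["price"] = df["price"].fillna(df["price"].mean())  # step 3: fill from same indicator

print(ratio.round(2).to_dict())  # {'price': 0.25, 'rating': 0.5, 'notes': 1.0}
print(df["price"].tolist())      # [10.0, 11.0, 12.0, 11.0]
```

Backing up the DataFrame before each destructive step (as the text advises) is as simple as `df.copy()`.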
### 4. Re-collect the data
When certain indicators are very important but have a high missing rate, you need to find out whether the data team or the business staff have other channels for obtaining the relevant data. This completes the missing-value cleaning step.
### 5. Verify consistency across sources
If your data comes from several sources, you need to verify that they agree with one another.
[Octoparse](https://www.octoparse.es/) offers data-cleaning options to convert the extracted data into the format you need; you can [refine the extracted data](https://helpcenter.octoparse.es/hc/es/articles/360056620474-Refinar-los-datos-extra%C3%ADdos-reemplazar-el-contenido-agregar-un-prefijo-) (replace content, add a prefix, ...) while scraping.
***
*Post: "Comprender los 3 Problemas Principales sobre la Limpieza de Datos" (Understanding the 3 Main Questions about Data Cleaning), by u/melisaxinyue, 10/27/2021. https://www.reddit.com/r/u_melisaxinyue/comments/qgob15/comprender_los_3_problemas_principales_sobre_la/*
***
"You'll realize how powerful regular expressions are once you start using them." - A developer's heartfelt sigh.
[https:\/\/fireship.io\/lessons\/regex-cheat-sheet-js\/](https://preview.redd.it/u2tn7l97f2j71.png?width=1920&format=png&auto=webp&v=enabled&s=6c2c9117265100eb8a1dcdcce7b92e342e806b1e)
## What Is a Regular Expression (RegEx)?
"A regular expression (sometimes called a rational expression) is a sequence of characters that define a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations.
The concept arose in the 1950s, when the American mathematician Stephen Kleene formalized the description of a regular language. It came into common use with the Unix text-processing utilities ed (a line editor for the Unix operating system) and grep (a command-line utility for searching plain-text data sets for lines that match a regular expression; a filter is a computer program or subroutine that processes a stream, producing another stream)." This is an excerpt from [Wikipedia](https://es.wikipedia.org/wiki/Regular_expression)'s definition of regular expressions.
[**Regular expression syntax**](https://docs.python.org/3/library/re.html)
Regular expressions can be concatenated to form new regular expressions; if A and B are both regular expressions, then AB is also a regular expression. In general, if a string p matches A and another string q matches B, the string pq will match AB. This holds unless A or B contain low-precedence operations, boundary conditions between A and B, or numbered group references. Thus, complex expressions can easily be constructed from simpler primitive expressions like the ones described here.
Regular expressions can contain both special and ordinary characters. Most ordinary characters, like 'A', 'a', or '0', are the simplest regular expressions; they simply match themselves. You can concatenate ordinary characters, so last matches the string 'last'. (In the rest of this section, we'll write REs in this special style, usually without quotes, and strings to be matched 'in single quotes'.)
## What Can You Do with RegEx?
Regular expressions can be used to match HTML tags and extract data from HTML documents.
### Here are some RegEx use cases:
#### [Using RegEx to extract emails](http://www.octoparse.es/blog/extraer-email-de-cadenas-o-archivos-txt)
#### [Using RegEx to extract phone numbers](https://octoparse.es/blog/regex-how-to-extract-all-phone-numbers-from-strings)
#### [RegEx to reformat extracted data](http://www.octoparse.es/tutorial-7/re-format-data-extracted)
HTML is virtually made of strings, and what makes regular expressions so powerful is that a single regular expression can match many different strings.
It's true that a regular expression is not the first choice for parsing HTML properly, because of common pitfalls such as missing closing tags and mismatched tags. Programmers are also more likely to use perfectly good HTML parsers such as PHPQuery, BeautifulSoup, or html5lib. But if you want to match HTML tags quickly and know a bit of regular expression syntax (easy to learn, hard to master), you can use this incredibly convenient tool to spot patterns in HTML documents.
Learning regular expressions is strongly recommended for every programmer, or anyone who wants to extract web data, because this tool improves the efficiency and productivity of your work.
Veamos algunos **ejemplos**:
* Expresiones regulares para coincidir con las etiquetas HTML:
<(.\*)>.\*?|<(.\*) />
<(\\S\*?)\[\^>\]\*>.\*?</\\1>|<.\*?/>
* Regular expression to match all TD tags:
<td\\s\*.\*>\\s\*.\*<\\/td>
* Regular expression to match <img src = "test.gif" />:
<\[a-zA-Z\]+(\\s+\[a-zA-Z\]+\\s\*=\\s\*("(\[\^"\]\*)"|'(\[\^'\]\*)'))\*\\s\*/>
We can match a variety of HTML tags using regular expressions and thus easily extract data from HTML documents.
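The TD pattern above can be sanity-checked in Python. Note this sketch uses a non-greedy variant of the article's pattern so that each cell matches separately (the greedy `.*` form would swallow everything up to the last `</td>`); the HTML fragment is a made-up example:

```python
import re

html = "<table><tr><td>cell 1</td><td>cell 2</td></tr></table>"
# Non-greedy variant of the article's <td\s*.*>\s*.*</td> pattern,
# with a capture group for the cell contents:
cells = re.findall(r"<td\s*.*?>\s*(.*?)</td>", html)
print(cells)  # → ['cell 1', 'cell 2']
```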
https://preview.redd.it/mvchv7sff2j71.png?width=600&format=png&auto=webp&v=enabled&s=c4aef5a3cfd3e1303d4d7f5c3ddbc400033ac179
([Download Octoparse](https://octoparse.es/download/windows) \- open the software \- click the toolbox icon in the lower-left corner)
## [Free RegEx Tool - Octoparse](https://octoparse.es/)
[Octoparse](https://octoparse.es/), a visual web data collection tool, provides a regular-expression generator. It can easily produce simple regular expressions to meet your various needs for extracting content from HTML documents. Octoparse also fully supports testing custom regular expressions.
*Source: "Uso de expresiones regulares para coincidir con HTML" (melisaxinyue, r/u_melisaxinyue, 8/23/2021): https://www.reddit.com/r/u_melisaxinyue/comments/p9va3w/uso_de_expresiones_regulares_para_coincidir_con/*

---
# What is web scraping?
[Web scraping](https://www.octoparse.es/), also known as [web harvesting](https://www.octoparse.es/blog/30-mejores-software-gratuitos-de-web-scraping) and [web data extraction](http://www.dataextraction.io/), basically refers to collecting data from websites via the Hypertext Transfer Protocol (HTTP) or through web browsers.
It is a web technique for extracting data from the web. It converts unstructured data or raw source code into structured data that you can store on your local computer or in a database. Usually, the data available on the Internet can only be viewed in a web browser, and almost no website gives users a built-in way to export the information it displays. The only alternative is the repetitive action of copying and pasting, and manually capturing and separating this data is a tedious, time-consuming task.
Fortunately, the technique of web scraping can run the process automatically and organize the data within minutes.
## How does web scraping work?
In general, web scraping involves three steps:
* **First**, we send a GET request to the server and receive a response in the form of web content.
* **Next**, we parse the website's HTML code, following its tree structure.
* **Finally**, we use a Python library to search the parse tree.
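The three steps above can be sketched with the standard library alone. The HTML here is canned so the snippet runs offline; in real use, step 1 would be e.g. `urllib.request.urlopen(url).read().decode()` against a real URL:

```python
from html.parser import HTMLParser

# Step 1 (canned): the web content a GET request would have returned.
html = "<html><body><h1>Hello</h1><p>World</p></body></html>"

class TitleGrabber(HTMLParser):
    """Steps 2-3: walk the tag tree and collect the text inside <h1>."""
    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.titles = []
    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True
    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False
    def handle_data(self, data):
        if self.in_h1:
            self.titles.append(data)

parser = TitleGrabber()
parser.feed(html)
print(parser.titles)  # → ['Hello']
```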
## How did it all start?
Although to many people it sounds like as fresh a technique as concepts such as "Big Data" or "machine learning", the history of web scraping is actually much longer. It dates back to the time when the World Wide Web, colloquially "the Internet", was born.
In the beginning, the Internet was not even searchable. Before search engines were developed, the Internet was just a collection of File Transfer Protocol (FTP) sites that users browsed to find specific shared files. To find and organize the distributed data available on the Internet, people created a specific automated program, known today as the **web crawler/bot**, to **fetch every page** on the Internet and then **copy all the content** into databases for indexing.
Then the Internet grew and became home to millions of web pages containing a wealth of data in multiple forms, including text, images, video, and audio. It became an open data source.
As this data source became incredibly rich and easy to search, people began to discover that the information they needed could be found easily. That information is usually scattered across many websites, but the problem is that when people want data from the Internet, not every website offers a download option, and copying and pasting by hand is cumbersome and inefficient.
And that is where web scraping came in. Web scraping is actually powered by web bots/crawlers, which work the same way as the ones used in search engines. That is, **fetch and copy**. The only difference may be the scale: web scraping focuses on extracting only specific data from certain websites, whereas search engines often fetch most of the websites on the Internet.
## How did web scraping develop?
* **1989: the birth of the World Wide Web**
Technically, the World Wide Web is different from the Internet. The former refers to the information **space**, while the latter is the **network** made up of computers.
Thanks to Tim Berners-Lee, the inventor of the WWW, we got the following three things that have become part of our daily life:
* Uniform Resource Locators (URLs), which we use to go to the website we want;
* embedded hyperlinks, which let us navigate between web pages, such as product detail pages where we can find product specifications and many other things like "customers who bought this also bought";
* web pages containing not only text but also images, audio, video, and software components.
* **1990: the first web browser**
Also invented by Tim Berners-Lee, it was called WorldWideWeb (no spaces), named after the WWW project. One year after the web appeared, people had a way to see it and interact with it.
* **1991: the first web server and the first http:// web page**
The web kept growing at a fairly moderate speed. In 1991, Tim Berners-Lee made the official announcement of the World Wide Web and distributed the first web server software, marking the debut of the web as a public service on the Internet and changing history forever. By 1994, the number of HTTP servers had passed 200.
* **1993: the first web robot - World Wide Web Wanderer**
In 1993, Matthew Gray, who studied physics at the Massachusetts Institute of Technology (MIT) and was one of the three members of the Student Information Processing Board (SIPB) who created the www.mit.edu site, decided to write a program, called the World Wide Web Wanderer, to systematically traverse the Web and collect sites.
The Wanderer first ran in the spring of 1993 and became the first automated web agent (web spider or crawler). It certainly did not reach every site on the Web, but it ran with a consistent methodology and, hopefully, yielded consistent data on the Web's growth.
* **December 1993: the first crawler-based web search engine - JumpStation**
Since there were not that many websites available on the web, search engines at the time used to rely on human website administrators to collect and edit links into a particular format.
JumpStation brought a new leap forward: it was the first WWW search engine to rely on a web robot.
Since then, people started using these programmatic web crawlers to harvest and organize the Internet. From Infoseek, AltaVista, and Excite to Bing and Google today, the core of a search-engine bot remains the same.
Because web pages are designed for human users, not for automated use, even with the development of the web bot it was still hard for computer engineers and scientists to do web scraping, let alone ordinary people. So people set out to make web scraping more accessible.
* **2000: Web APIs and API crawlers**
API stands for **Application Programming Interface**. It is an interface that makes developing a program much easier by providing the building blocks.
In 2000, Salesforce and eBay launched their own APIs, with which programmers could access and download some of the publicly available data.
Since then, many websites have offered web APIs so that people can access their public databases.
*Send an HTTP request, receive JSON or XML in return.*
Web APIs collect only the data the website chooses to provide, but they offer developers a friendlier way to do web scraping.
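The request/JSON round-trip can be sketched as follows. The endpoint and payload are hypothetical, and the response is canned so the example runs offline (in real use, the raw string would come from an HTTP GET, e.g. via `urllib.request`):

```python
import json

# What a GET to a hypothetical https://api.example.com/items?page=1
# might return as its JSON body:
raw = '{"items": [{"id": 1, "name": "widget"}], "next_page": 2}'

data = json.loads(raw)                          # parse the JSON payload
names = [item["name"] for item in data["items"]]  # pick out the fields we want
print(names)  # → ['widget']
```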
* **2004: Python's Beautiful Soup**
Not all websites offer APIs. Even when they do, they don't provide all the data users want. So programmers kept working on an approach that could make web scraping easier.
In 2004, Beautiful Soup was released. It is a library designed for Python.
In computer programming, a library is a collection of script modules, such as commonly used algorithms, that can be used without rewriting, simplifying the programming process.
With simple commands, Beautiful Soup makes sense of a site's structure and helps parse content out of the HTML container. It is considered the most sophisticated and advanced library for web scraping, and also one of the most common and popular approaches today…
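A small sketch of Beautiful Soup pulling content out of an HTML container (assumes the `beautifulsoup4` package; the HTML fragment and class names are made up):

```python
from bs4 import BeautifulSoup

html = """
<div class="product">
  <span class="name">Gadget</span>
  <span class="price">$9.99</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
# CSS selectors walk the parse tree without any manual string handling:
name = soup.select_one(".name").get_text()
price = soup.select_one(".price").get_text()
print(name, price)  # → Gadget $9.99
```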
*Source: "Servicios De Web Scraping: Cómo Comenzó y Qué Sucederá en El Futuro" (melisaxinyue, r/u_melisaxinyue, 9/30/2021): https://www.reddit.com/r/u_melisaxinyue/comments/pyew3s/servicios_de_web_scraping_cómo_comenzó_y_qué/*

---
Data analytics lets companies analyze all of their data (real-time, historical, unstructured, structured, qualitative) to identify patterns and generate insights that inform, and in some cases automate, decision-making, linking data intelligence to action. Today's best data-analytics tools support the analytics process end to end, from data access, preparation, and analysis to deploying analyses and tracking results.
https://preview.redd.it/ek46dn602fp71.jpg?width=672&format=pjpg&auto=webp&v=enabled&s=e9322917bf9d5945fa745c4afd136966659c6eb7
Let's take a look at the features a data-analytics tool should have.
## 1. Business intelligence and reporting
Analyzing data and providing actionable insights to business executives and other end users so they can make informed business decisions is one of the most important uses of data analytics. Data analytics, also known as "business intelligence", is an information portal for any company. Consumers, developers, data modelers, data-quality managers, business executives, operations managers, and others rely on reports and dashboards to help monitor business progress, status, outages, revenue, partners, and so on.
## 2. Data wrangling / data preparation
A good data-analytics solution includes practical self-service data-preparation and data-wrangling features that can easily and quickly gather data from various incomplete, complex, or messy data sources and clean it up for easy mashup and analysis.
## 3. Data visualization
To glean insights from data, many analysts and data scientists rely on data visualization, the graphical representation of data, to help people intuitively explore and spot patterns and outliers. Excellent data-analytics solutions will include [data visualization](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos) capabilities that make data exploration easier and faster.
## 4. Geospatial and location analytics
If your analytics solution does not include geospatial and location analytics, analyzing large datasets often makes little sense. Adding this layer of intelligence to data analytics lets you develop insights and uncover relationships in the data that you might never have seen before. You can better predict where your most valuable customers are and how they will buy your products.
## 5. Predictive analytics
Today, one of the most important uses of business data analytics is predicting events: for example, when a machine will fail, or how much inventory a particular store will need at a specific time. Predictive analytics involves taking historical data and building models to help predict future events. Traditionally, advanced analytics has been the domain of well-trained data scientists, statisticians, and data engineers. But as software advances, citizen data scientists are increasingly taking on these roles, and many analytics firms predict that citizen data scientists will surpass data scientists in the amount of advanced analytics they produce.
## 6. Machine learning
Machine learning involves automating iterative analytical models using algorithms that can learn from data repeatedly and optimize performance. With machine-learning algorithms for big data, you can have computers find new patterns and insights without explicitly programming what to look for. Look for data-analytics solutions that provide natural-language search, image analytics, and augmented analytics.
## 7. [Streaming data](https://es.wikipedia.org/wiki/Diagrama_de_flujo_de_datos) analytics
Real-time event processing at critical junctures has become a key function of today's data analytics. Extracting data in real time from streaming IoT devices, video feeds, audio feeds, and social-media platforms is a basic function of today's leading analytics solutions.
*Source: "¿Cómo elegir una herramienta de análisis de datos? ¿Qué funciones se requieren?" (melisaxinyue, r/bigdata, 9/24/2021): https://www.reddit.com/r/bigdata/comments/pufvx6/cómo_elegir_una_herramienta_de_análisis_de_datos/*

---
According to [***Viajes National Geographic***](https://viajes.nationalgeographic.com.es/a/destinos-mas-populares_11415), among the 25 most popular destinations in the world according to travelers in 2019, Spanish-speaking countries occupy 4 spots. The tourism industry was outpacing the global economy until 2019, then plummeted in 2020. Yes, you know the reason. But the good news is that travel and tourism are regaining ground, and the tourism industry is leveraging technology to speed things up. One such impactful, transformative technology for the travel and tourism sector is Big Data. This piece is meant to be your guide to "**Big Data in tourism**". After reading it, you will learn:
https://preview.redd.it/xsr7ta8aaxd71.png?width=1600&format=png&auto=webp&v=enabled&s=fb633e1b11345a497bfea6ed78c6481f638f9f96
**Table of contents**
[Part 1: What is big data in tourism?](https://www.octoparse.es/blog/big-data-en-turismo#h1)
[Part 2: How can big data be harvested/extracted using web scraping?](https://www.octoparse.es/blog/big-data-en-turismo#h2)
[Part 3: Business use cases of big data in tourism](https://www.octoparse.es/blog/big-data-en-turismo#h3)
[Part 4: Challenges of big data in travel & tourism](https://www.octoparse.es/blog/big-data-en-turismo#h4)
[Conclusion](https://www.octoparse.es/blog/big-data-en-turismo#h5)
# What is big data in tourism?
What is big data? Big data is an approach to systematically extracting, processing, and visualizing very large datasets. What does that mean? Suppose you have personal data (name, age, gender, country, and so on) for 1,000 customers who visited France in 2020. Is that big data? Not really. But if you have data on a million people who traveled to different corners of the world, the hotels they stayed in, the food they ordered, the places they visited, the events that took place in different geographies, the different online/offline resources they used, and so on, then it should rightly be considered big data. **In simple terms**,
* Big data usually exceeds several terabytes (TB) or petabytes (PB) in size.
* It is high in volume, variety, and velocity.
* Traditional methods are inefficient for extracting, analyzing, and visualizing big data.
So you use
* Modern data-extraction techniques, such as web scraping, to pull data from the web.
* Sophisticated tools such as Hadoop, KNIME, NodeXL, QlikView, FusionCharts, Watson Analytics, etc. to run big-data analytics and visualize trends and patterns.
What kind of big data can be captured in the tourism industry?
1. **Destination information** \- places to visit, rules & regulations, etc.
2. **Hotel & restaurant data** \- customer ratings & reviews, prices, service details, etc.
3. **Transportation data** \- airline tickets, route maps, traffic data
4. **Event data** \- sports, festivals, summits, etc.
5. **Government scheme data**
6. **Tourist data** \- tourist arrivals & departures, place of origin, age, languages spoken, etc.
7. **Social data** \- travelers' geotagged social-media posts, brand tags, travel-related hashtags, etc.
8. **Travel & hospitality news**
9. **Miscellaneous data**: web searches, website visits, passport & visa applications, international education and employment, etc.
# How to extract travel & tourism big data?
To extract travel & tourism big data, you can choose:
**1. Web scraping**
Tourism web scraping is an approach to programmatically extracting data from travel & tourism-focused websites. You can use [web scraping tools](https://www.octoparse.es/) or write your own scripts to extract data from these sites. For example, suppose you need hotel & restaurant data (prices, reviews, services provided, location, etc.) from Booking.com. [You can easily extract hundreds and thousands of hotel & restaurant records from Booking.com in just a few minutes](https://helpcenter.octoparse.com/hc/en-us/articles/360018559132-Scrape-hotel-data-from-Booking?__hstc=97730752.cfa4be011358393efe2a4d1b0e579f03.1626416084584.1627455840887.1627462306621.27&__hssc=97730752.4.1627462306621&__hsfp=2714634594).
Use web scraping tools if you need
* To extract tourism big data quickly
* An affordable, scalable, robust, and secure solution
**2. Internal & external resources**
* You can also source data from within the company: look for valuable data in your booking-management software, service & account books, etc.
* You could also ask tourism boards and third-party data providers to supply the data. But data veracity is much higher if you extract it yourself via web scraping.
# What are the use cases of big data in the tourism industry?
The key to reviving tourism & related sectors is to harness the power of culture and creativity, optimize travel operations, personalize offers, deliver seamless experiences, and unlock new channels of growth. Big data can help here:
**1. Tourism market research**
Over the last decade, mobile-device penetration has positively catalyzed the amount of data produced by different sources. In the tourism sector, data falls into three categories:
1. ***User-generated data:*** *online textual data such as blogs, social-media posts, comments, reviews, and geotagged food, restaurant, and destination image data;*
2. ***Device data:*** *CCTV cameras at tourist sites, hotels, and restaurants; mobile & vehicle GPS data; mobile roaming data; Bluetooth data; weather data; etc.;*
3. ***Transaction data:*** *search-engine data, web-page visit data, online booking data, etc.*
All of this data can be good fodder for researchers carrying out tourism-related research. What kind of research? Tourism trends, tourist habits and behaviors, the impact of certain schemes/campaigns on tourist inflow, and so on.
**2. Rational allocation of traffic resources**
Identifying and predicting traffic flow in real time could be the answer to easing congestion at tourist attractions and along tourist routes:
* Temporal correlation: fixed and mobile traffic-flow detectors can help collect and analyze traffic data.
* Spatial correlation: traffic flow at one or more junctions affects flow at neighboring junctions.
* Historical correlation: traffic data on flow, speed, and occupancy show similar time-based characteristics on particular weekdays or weekends.
* Traffic clustering models, data-fusion models, correlational models, and optimization models are the most common analytical approaches for efficiently exploiting traffic-related big data and managing traffic intelligently.
**3. Operational management of tourist attractions through analysis of tourist behavior**
* Analyzing tourist behavior can be useful to different stakeholders.
* Demand in travel, tourism, and hospitality is seasonal, calling for flexible operations. Big data can be crucial for cost-effectively managing the operational side of tourist attractions.
* Besides destination operations-management agencies, it can be useful to retail chains, hotel and restaurant chains, and so on.
* Tourists have high spending potential. When…
*Source: "¿Qué es el Big Data en el turismo?" (melisaxinyue, r/u_melisaxinyue, 7/28/2021): https://www.reddit.com/r/u_melisaxinyue/comments/ot6hj7/qué_es_el_big_data_en_el_turismo/*

---
11/13/2020 9:58:17 AM
El web scraping es difícil, por mucho que queramos reclamarlo como simple clic y búsqueda, esta no es toda la verdad. Bueno, piense en el tiempo, cuando no hemos tenido web scrapers como [Octoparse](https://www.octoparse.es/), [Parsehub](http://www.parsehub.com/) o [Mozenda](http://www.mozenda.com/), cualquier persona que carece de conocimientos de programación se ve obligada a dejar de usar tecnología intensiva como el web scraping. A pesar del tiempo que lleva aprender el software, podríamos llegar a apreciar más de lo que ofrecen todos estos programas "inteligentes", que han hecho posible el web scraping para todos.
**Por qué web scraping es defícil?**
https://preview.redd.it/wyay1nuwczy51.png?width=913&format=png&auto=webp&v=enabled&s=28f8988cb14c661e1ecb0c627a4ae6bf4820bc0c
* Coding is not for everyone
Learning to code is interesting, but only if you are interested. For those who lack the drive or the time to learn, it can be a real obstacle to getting data from the web.
* Not all websites are the same (obviously)
Sites change all the time, and maintaining scrapers can be very costly and time-consuming. While scraping ordinary HTML content may not be that hard, there is much more to it than that. What about scraping PDF, CSV, or Excel files?
* Web pages are designed to interact with users in many innovative ways
Sites built on complicated JavaScript and AJAX mechanisms (which happen to be most of the popular sites you know) are difficult to scrape. Sites that require login credentials to access the data, or that dynamically change the data behind forms, can also be a major headache for web scrapers.
* Anti-scraping mechanisms
With growing awareness of web scraping, naive scraping is easily recognized as bot traffic and blocked. CAPTCHAs or rate limits often appear after frequent visits within a short time. Tactics such as rotating user agents, changing IP addresses, and switching proxy servers are used to defeat common anti-scraping schemes. In addition, adding delays between page downloads or adding human-like browsing actions can give the impression that "you are not a bot".
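Two of the tactics just mentioned, rotating the User-Agent header and adding human-like delays, can be sketched like this (the user-agent strings are truncated placeholders, and the delay bounds are arbitrary):

```python
import random
import time

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0)",
]

def polite_headers():
    """Pick a random User-Agent for each request."""
    return {"User-Agent": random.choice(USER_AGENTS)}

def human_delay(min_s=1.0, max_s=3.0):
    """Sleep a random interval between page downloads."""
    time.sleep(random.uniform(min_s, max_s))

# Each request would attach fresh headers and pause afterwards:
headers = polite_headers()
print(headers["User-Agent"])
```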
* A "super" server is needed
Scraping a few pages and scraping at scale (say, millions of pages) are totally different stories. Large-scale scraping requires a scalable system with I/O mechanisms, distributed crawling, inter-process communication, task scheduling, deduplication checks, and so on.
Learn more about [what web scraping is](https://www.octoparse.com/blog/web-scraping-introduction) if you are interested.
**How does an "automatic" web scraper work?**
Most, if not all, automatic web scrapers decipher the HTML structure of the web page. By "telling" the scraper what you need through dragging and clicking, the program "guesses" which data you are after using various algorithms, and finally fetches the target text, HTML, or URL from the web page.

https://preview.redd.it/rcjl0tcyczy51.png?width=811&format=png&auto=webp&v=enabled&s=fbf63483b2d8c779354be3840c94c91cff63af7f

**Should you consider using a web scraping tool?**
There is no perfect answer to this question. However, if you find yourself in any of the following situations, it is worth checking out what a scraping tool can do for you. You:
1) don't know how to code (and have no desire/time to dig in)
2) are comfortable using a computer program
3) have a limited time/budget
4) are looking to scrape many websites (and the list keeps changing)
5) want to scrape the web on an ongoing basis
If you fit one of the above, here are a couple of articles to help you find the scraping tool that best suits your needs:
[The 30 best free web scraping tools](https://www.octoparse.es/blog/30-mejores-software-gratuitos-de-web-scraping)
[The 20 best web scraping tools for data extraction](https://www.octoparse.es/blog/las-20-mejores-herramientas-de-web-scraping)
**Web scrapers are getting "smarter"**
The world keeps progressing, and so do all the different web scraping tools. I recently did some research on various scraping tools, and I am very happy to see that more and more people understand and use web scraping.
[**Octoparse**](https://www.octoparse.es/) recently launched a [new beta version](https://www.octoparse.es/download) that introduced a new Template mode for scraping with pre-built templates. Many popular sites such as Amazon, Indeed, Booking, TripAdvisor, Twitter, and YouTube are covered. In the new Template mode, users are prompted to enter variables such as keywords and location, and the scraper then takes care of collecting data from the website. It is quite an interesting feature if there is a template for what you want, and I believe the Octoparse team is constantly adding new templates as well.
https://preview.redd.it/npideswzczy51.png?width=817&format=png&auto=webp&v=enabled&s=a4008f703ddcd5b1a40047d44201b4746b0deae9
Also included in the beta is a new URL feature that lets you:

1. Add up to 1 million URLs to any single task/crawler (compared with the previous limit of 20,000 URLs)
2. Import batch URLs from local files or from another task
3. Generate URLs that follow a predefined pattern; a simple example is one where only the page number changes
4. If you have a job that was actually split in two, one task to extract URLs and another to extract specific data from those extracted URLs, chain the two tasks directly without having to manually "transfer" the URLs from one task to the other
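Point 3 above, generating URLs from a predefined pattern, amounts to filling in the part that changes; a tiny sketch (the base URL is a placeholder):

```python
# Pattern where only the page number changes:
base = "https://example.com/search?q=laptops&page={}"
urls = [base.format(page) for page in range(1, 6)]

print(urls[0])    # → https://example.com/search?q=laptops&page=1
print(len(urls))  # → 5
```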
**Mozenda** made major feature updates, such as inline data comparison and mobile-agent data. Other earlier updates, such as request blockers and the job sequencer, can also make the scraping process more efficient.

https://preview.redd.it/s2wyccq0dzy51.png?width=1799&format=png&auto=webp&v=enabled&s=2f219e853ec03ea7c60dc789128554ba4dc62470

**Dexi.io** introduced a trigger feature that performs actions based on what happens in your Dexi.io account. If you have a complex job, it is worth a look.
**Import.io** added two new features that can be extremely useful if you need them: webhooks and extractor tagging. With webhooks, you can now receive notifications in many third-party programs such as AWS, Zapier, or Google Cloud as soon as the data for a job has been extracted.
Extractor tagging allows additional tagging via the API and aims to make data integration and storage easier and more efficient. Just a month earlier, Import.io had made it much easier to obtain foreign data by offering country-based extractors. Now you can get data as if you were physically located in another country!
**Examples of how web scraping is used**

https://preview.redd.it/x9wm7nj1dzy51.png?width=394&format=png&auto=webp&v=enabled&s=99c9f7d9fe69b92f358367c6911dd68d6f5e4ee9

With new information being added to the web second by second, the possibilities are endless!
Collect real-estate listings (Zillow, Realtor.com)
Collect lead information such as emails and phone numbers (Yelp, Yellowpages, etc.)
Scrape the information…
*Source post: "Hacer Más Fácil el Web Scraping Técnica" by melisaxinyue, 11/13/2020, https://www.reddit.com/r/u_melisaxinyue/comments/jtehe7/hacer_más_fácil_el_web_scraping_técnica/*
https://preview.redd.it/58edn12agzy51.png?width=1357&format=png&auto=webp&v=enabled&s=94ce8965e443bdc803adaf9f1380fc4561328a2e
According to the [World Tourism Organization (UNWTO)](https://www.unwto.org/news), total worldwide tourist arrivals reached almost 1.5 billion in 2019, growing by a remarkable 4% over the previous year. Travel remains one of the most competitive industries, dominated by accommodation and transportation services.
**What is a hotel data scraping tool?**
A hotel data scraper is a web scraping tool ([data extraction software, web spider, web crawler](https://www.octoparse.es/)) that can extract hotel and travel data from websites.
**Why do we need to collect hotel and travel data?**
Nowadays there are many kinds of information about hotels spread across multiple platforms. We need to collect and integrate this data before we can find common characteristics and perform data analysis: for example, how many hotels there are in each grade, their geographic location, their average price, their decoration style, and so on.
**What hotel-related data can you collect?**

* Hotel names
* Room prices
* Ratings
* Addresses (e.g., street, city, state, country, and postal code)
* Hotel amenities
* Descriptions
* Websites
* Phone/fax numbers
* Occupancy rates
* Room types
* Images
...
In short, you can extract almost any useful information you see on a web page!
**Data sources: where can you scrape the data?**
Hotel booking sites include TripAdvisor.com, Booking.com, Expedia.com, Trivago.com, Travelocity.com, and Hotwire.com. Each site holds tons of information about hotels all over the world.
**Why would you scrape hotel data? Here are some reference examples.**

* **Monitor hotel prices or hotel ratings**
Knowing what your competitors offer helps you stay on top of the game, especially when competition is as fierce as it is among accommodation services. Keeping room prices adjusted and updated in a timely manner is critical to the final sales figure.

* **Predict occupancy rates**
Predicting when a hotel has its highest and lowest occupancy is vital for an effective pricing strategy, especially during the holidays.

* **Brand management: what are customers saying about you or your competitors?**
Having reviews and comments scraped and analyzed helps you keep an eye on how customers feel about the hotel and the services offered.

* Get the best hotel deals
* Develop an effective marketing strategy
* Win customers
* ...

**How can we extract hotel data efficiently?**
Automatic web scrapers such as [Octoparse](https://www.octoparse.es/), Dexi.io, Parsehub, and Import.io can be a smart choice if you are not a technical user but want to scrape data at low cost.

* No coding at all
* Easy to use
* Inexpensive
**Only three steps. Let's build a hotel scraper from scratch!**
Take an automatic web scraping tool such as [Octoparse](https://www.octoparse.es/), for example. There is already a simple web scraping template for Booking.com in Octoparse's built-in browser that you can use directly and conveniently.
But if you want to design your own scraper with Octoparse and customize the extraction fields, follow the three steps below.
**Step 1. Scrape hotel data from all pages**
In this article, I will extract the following information from [Booking.com](https://www.booking.com/):

* Hotel name
* Price
* Address
* Rating
* Hotel image URL
First, load the target web page in Octoparse's built-in browser. To collect from all available pages, click the next-page button ("**>**") and then select "**loop click the selected link**" in the Action Tips menu. The crawler is now instructed to go through all available pages during the scraping process.
https://preview.redd.it/cj0aaxmbgzy51.png?width=1191&format=png&auto=webp&v=enabled&s=e2747a06cf136732c98ad2afae9570fd1b152f9a
**Step 2. Click through to each hotel's detail page**
Click the hotel titles on the listing page one by one until all the titles are selected (selected elements are highlighted in green), then select "Loop click each element" in the Action Tips menu. Octoparse clicks every available listing on the page and lands on each hotel's detail page.

https://preview.redd.it/bxfh5x4cgzy51.png?width=1191&format=png&auto=webp&v=enabled&s=1b1d5d0bdf9d335de60b16da78e352584f938516
**Step 3. Select the data you need for extraction**
Click the data fields you need (in this example, the hotel name, rating, and address are selected).

https://preview.redd.it/7jrj5tpcgzy51.png?width=1172&format=png&auto=webp&v=enabled&s=71d63242e50d930dff08663166dee00383d4e06f
Congratulations, you are almost there! All you need to do next is run the task!

https://preview.redd.it/xcvojcadgzy51.png?width=794&format=png&auto=webp&v=enabled&s=4608aa6c7949a6599953246b6776281bbfb56806
To learn more about scraping data from Booking.com, see this [step-by-step tutorial](http://octoparse.es/case-tutorials); you can also see how to [scrape hotel data from Tripadvisor](https://www.octoparse.es/tutorial-7/scrape-hotel-data-from-tripadvisor), [extract hotel data from Booking](https://helpcenter.octoparse.com/hc/en-us/articles/360018559132-Scrape-hotel-data-from-Booking), and [scrape Airbnb data](https://www.octoparse.com/tutorial-7/scrape-room-data-from-airbnb).
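For readers who would rather script the extraction than use a visual tool, the core of Step 3 (picking fields out of a page) can be sketched with Python's standard-library HTML parser. The `hotel-name` class below is made up for illustration; real Booking.com markup differs and changes often:

```python
from html.parser import HTMLParser

class HotelNameParser(HTMLParser):
    """Collect the text of elements whose class is 'hotel-name'
    (a hypothetical class name; inspect the real page first)."""
    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False

    def handle_starttag(self, tag, attrs):
        # Mark that the next text chunk belongs to a hotel name.
        if dict(attrs).get("class") == "hotel-name":
            self._in_name = True

    def handle_data(self, data):
        if self._in_name:
            self.names.append(data.strip())
            self._in_name = False

# A tiny stand-in for a downloaded listing page.
page = '<div class="hotel-name">Hotel Aurora</div>'
parser = HotelNameParser()
parser.feed(page)
```

In practice you would feed the parser the HTML of each detail page fetched in Step 2 and add similar handlers for price, rating, and address.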
*Source post: "Cómo Construir un Scraper de Hotel Datos" by melisaxinyue, 11/13/2020, https://www.reddit.com/r/u_melisaxinyue/comments/jteo4b/cómo_construir_un_scraper_de_hotel_datos/*
https://preview.redd.it/8y657gvhczy51.png?width=1350&format=png&auto=webp&v=enabled&s=d8887ba42fafc4c6187c93f96a7909800d0cabf1
Photo by [Ian Schneider](https://unsplash.com/@goian) on [Unsplash](https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText)
With the advent of big data, people have begun pulling data from the Internet for [data analysis](https://www.octoparse.es/blog/30-mejores-herramientas-de-big-data-para-datos-analisis) with the help of [web crawlers](https://www.octoparse.es/DataCrawler). There are several ways to build your own crawler: browser extensions, Python code with [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) or Scrapy, and also [data extraction tools](https://www.octoparse.es/blog/las-20-mejores-herramientas-de-web-scraping) such as [Octoparse](https://www.octoparse.es/).
However, there is a constant coding war between spiders and anti-bot measures. Web developers apply different kinds of anti-scraping techniques to keep their websites from being scraped. In this article, I have listed the five most common anti-scraping techniques and how they can be avoided.
**1. IP**
One of the easiest ways for a website to detect web scraping activity is IP tracking. A website can identify whether an IP belongs to a robot based on its behavior. When a site finds that an **overwhelming number** of requests have been sent from **a single IP address** **periodically** or within **a short period of time**, there is a good chance the IP will be blocked on suspicion of being a bot. What really matters when building a block-resistant crawler, then, is the **number** and **frequency** of visits per unit of time. Here are some scenarios you may run into.
**Scenario 1**: Making multiple visits within seconds. There is no way a real human can browse that fast, so if your crawler sends frequent requests to a website, the site will definitely block the IP and identify it as a robot.
**Solution:** Slow down your scraping. Setting up a delay time (for example, a "sleep" function) before execution, or increasing the wait time between two steps, always works.
**Scenario 2**: Visiting a website at exactly the same pace. A real human does not repeat the same behavior pattern over and over. Some websites monitor request frequency, and if requests are sent periodically with exactly the same pattern, such as once per second, the anti-scraping mechanism is very likely to be triggered.
**Solution**: Set a random delay time for each step of your crawler. With a randomized scraping speed, the crawler behaves much more like a human browsing a website.
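The random-delay idea can be sketched in a few lines of Python (the 2-6 second range below is an arbitrary example; tune it to the target site):

```python
import random
import time

def polite_sleep(min_s: float, max_s: float) -> float:
    """Pause for a random interval so consecutive requests do not
    arrive at a fixed, bot-like rhythm. Returns the chosen delay."""
    delay = random.uniform(min_s, max_s)
    time.sleep(delay)
    return delay

# Between two scraping steps, e.g.:
# fetch(page_url)
# polite_sleep(2.0, 6.0)   # wait a human-ish, unpredictable amount
# fetch(next_page_url)
```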
**Scenario 3:** Some high-level anti-scraping techniques incorporate complex algorithms to track requests from different IPs and analyze their average behavior. If an IP's requests are unusual, such as sending the same number of requests or visiting the same website at the same time every day, it will be blocked.
**Solution:** Change your IP periodically. Most VPN services, [cloud servers](https://www.octoparse.es/tutorial-7/cloud-extraction), and proxy services can provide rotated IPs. When requests go out through rotating IPs, the crawler behaves less like a bot, which reduces the risk of being blocked.
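A minimal sketch of IP rotation, assuming you already have a pool of proxy endpoints from a provider (the URLs below are placeholders). The returned mapping is in the shape the popular `requests` library expects for its `proxies` argument:

```python
import itertools

# Placeholder endpoints -- substitute your proxy provider's pool.
PROXIES = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:8080",
]

# cycle() loops over the list forever, wrapping back to the start.
_pool = itertools.cycle(PROXIES)

def next_proxy() -> dict:
    """Rotate through the pool so consecutive requests leave
    from different IP addresses."""
    proxy = next(_pool)
    return {"http": proxy, "https": proxy}

# With the requests library this would be used as:
# requests.get(url, proxies=next_proxy(), timeout=10)
```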
**2. [Captcha](https://en.wikipedia.org/wiki/CAPTCHA)**
Have you ever seen this kind of image while browsing a website?
1. One that needs a click

https://preview.redd.it/f7of57uiczy51.png?width=293&format=png&auto=webp&v=enabled&s=d1fb8a3478f650c1f64172a64da3e7768f78bc15
2. One that needs you to select specific images

https://preview.redd.it/2bd7ka4kczy51.png?width=349&format=png&auto=webp&v=enabled&s=ca386cb6c3be8ca78bfff75d54a021e93ab6d654
These images are called Captchas. Captcha stands for Completely Automated Public Turing test to tell Computers and Humans Apart. It is an automated public program for determining whether the user is a human or a robot. The program presents various challenges, such as a degraded image, fill-in-the-blanks, or even equations, which supposedly only a human can solve.
This test has evolved over a long time, and many websites currently apply Captcha as an anti-scraping technique. For a while it was very hard to pass a Captcha directly, but nowadays many [open-source tools](https://tympanus.net/codrops/2009/09/22/21-free-captcha-sources/) can be applied to solve Captcha problems, although they may require more advanced programming skills. Some people even build their own feature libraries and image-recognition techniques, using machine learning or deep learning skills, to pass the check.
**It is easier not to trigger it than to solve it**
For most people, the easiest way is to slow down or randomize the extraction process so as not to trigger the Captcha test. Adjusting the delay time or using rotated IPs can effectively reduce the probability of triggering the test.
**3. Log-in**
Many websites, especially social media platforms such as Twitter and Facebook, only show you information after you have signed in. To crawl sites like these, crawlers also need to simulate the sign-in steps.
After logging in to the website, the crawler needs to save the cookies. A cookie is a small piece of data that stores browsing state for users. Without the cookies, the website would forget that you have already signed in and ask you to sign in again.
Moreover, some websites with strict scraping countermeasures may allow only partial access to the data, such as 1,000 lines of data per day, even after sign-in.
**Your bot needs to know how to sign in**
1) Simulate keyboard and mouse operations. The crawler should simulate the log-in process, which includes steps such as clicking the text box and the "sign in" buttons with the mouse, and typing the account and password information with the keyboard.
2) Sign in first, then save the cookies. Websites that allow cookies remember users by saving their [cookies](https://en.wikipedia.org/wiki/HTTP_cookie). With these cookies, there is no need to sign in to the website again in the short term. Thanks to this mechanism, your crawler can skip the tedious log-in steps and scrape the information you need.
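A standard-library sketch of "sign in once, reuse the cookies". The login URL and form field names are hypothetical; inspect the real site's login form before adapting this:

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

# The CookieJar captures Set-Cookie headers from the login response
# and replays them on every later request made through this opener.
jar = CookieJar()
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(jar)
)

def login(login_url: str, username: str, password: str) -> None:
    """POST the credentials; the session cookie ends up in `jar`.
    Field names 'username'/'password' are assumptions."""
    form = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode()
    opener.open(login_url, data=form)

# After a successful login(), opener.open(protected_url) carries the
# saved cookies, so the crawler does not have to sign in again.
```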
3) If, unfortunately, you run into the strict mechanisms described above, you can schedule your crawler to monitor the website at a fixed frequency, such as once a day. Schedule the crawler to scrape the newest 1,000 lines of data each period and keep accumulating the latest data.
**4. UA**
UA stands for [User-Agent](https://www.howtogeek.com/114937/htg-explains-whats-a-browser-user-agent/), a request header that tells the website how the user is visiting. It contains information such as the operating system and its version, CPU type, browser and its version, browser language, browser plug-ins, and so on.
An example UA: Mozilla/5.0 (Macintosh; Intel Mac OS X 10\_7\_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11
When scraping a website, if your crawler carries no headers, it only identifies itself as a script (for example, if you use Python to build the crawler, it declares itself as a Python script). Websites will definitely block a request from a bare script. In this case, the b...
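For example, with Python's standard library a bare request announces itself as "Python-urllib/3.x"; attaching a browser-style User-Agent (reusing the sample UA string quoted above) looks like this:

```python
import urllib.request

# The example UA string quoted earlier in the article.
UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) "
      "AppleWebKit/535.11 (KHTML, like Gecko) "
      "Chrome/17.0.963.56 Safari/535.11")

def make_request(url: str) -> urllib.request.Request:
    """Build a request that identifies as a browser instead of a
    bare Python script."""
    return urllib.request.Request(url, headers={"User-Agent": UA})

# urllib.request.urlopen(make_request("https://example.com"))
```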
*Source post: "5 Técnicas Anti-Scraping que Puedes Encontrar" by melisaxinyue, 11/13/2020, https://www.reddit.com/r/u_melisaxinyue/comments/jtegdr/5_técnicas_antiscraping_que_puedes_encontrar/*
[Web scraping](https://www.octoparse.es/) is a way of extracting data from the web using automation tools and technologies. Companies used to be quite informal about collecting web data, but with the arrival of [the GDPR (RGPD) regulations](https://www.legislation.gov.uk/eur/2016/679/contents), due diligence around data extraction has become a must.
Recently, Poland imposed a [fine of 220,000 euros](https://www.achievedcompliance.com/poland-imposes-fines-for-web-scraping-of-personal-data-when-notification-to-individuals-did-not-occur/) on an organization that collected data on around 7 million people but did not inform them (informing the individuals is a rule under Article 14 of the GDPR). In addition, a few months ago the French DPA issued guidance on commercial web scraping. So we thought we would explain what the GDPR means and why it matters to the scraping community. Read this article to learn everything you need to **comply with the GDPR** while scraping the web.
**Table of contents**
* [When does the GDPR come into play?](https://www.octoparse.es/blog/cumplimiento-de-rgpd-en-web-scraping#h1)
* [What qualifies as personally identifiable information (PII)?](https://www.octoparse.es/blog/cumplimiento-de-rgpd-en-web-scraping#h2)
* [Are you scraping the personal information of EU citizens?](https://www.octoparse.es/blog/cumplimiento-de-rgpd-en-web-scraping#h3)
* [Do you have a legal basis for scraping personal data?](https://www.octoparse.es/blog/cumplimiento-de-rgpd-en-web-scraping#h4)
* [What can you do to comply with the GDPR?](https://www.octoparse.es/blog/cumplimiento-de-rgpd-en-web-scraping#h5)
* [Conclusion](https://www.octoparse.es/blog/cumplimiento-de-rgpd-en-web-scraping#h6)
## When does the GDPR come into play?
First we look at what can be scraped from the web, and then we examine which kinds of data fall under the GDPR and which do not.
You can scrape:
* Real estate listings for personalized marketing,
* Stock indexes and news portals for market intelligence,
* Job postings to power your HR services,
* Social media sites to analyze customer sentiment,
* Online directories for prospecting,
* Public data from government websites for insights,
* Product data from e-commerce sites for competitor tracking and price intelligence,
* Blogs, videos, and the like.
Of course, the use cases of [data extraction](https://helpcenter.octoparse.es/hc/es/articles/360055954154-Extraer-datos) are not limited to these, but at a broad level this gives you an idea of the different kinds of data you can extract. Now, the **GDPR**, which stands for the **General Data Protection Regulation (EU) 2016/679**, is a European Union law on data protection and privacy for everyone within the EU and the EEA. The GDPR has two purposes:
* It puts individuals in control of how their data is used
* It simplifies the regulatory environment for companies operating in the EU region
The question is, where do data scraping and the GDPR intersect? When should you worry about the GDPR? The short answer: whenever you extract the personal information of an individual or citizen residing in the EU.
To find out whether you need to comply with the GDPR, and to make sure your scraping project is GDPR-compliant, answer the following questions:
* What qualifies as personally identifiable information (PII)?
* Are you scraping the personal information of EU citizens?
* Do you have a legal basis for scraping personal data?
* What can you do to comply with the GDPR?
[Original Image](https://preview.redd.it/6q4otfh1b2i71.png?width=1600&format=png&auto=webp&v=enabled&s=06acea9a7921218a9a63abd61d4fe5bc61ed9b2e)
## What qualifies as personally identifiable information (PII)?
Any data that can help someone trace or identify a person qualifies as PII. Some examples:
* Name
* Email
* Contact numbers
* Postal address
* Credit card details
* Bank details
* IP address
* Date of birth (DOB)
* Image/video/audio of the person
* Medical reports
* Employment details, etc.
## Are you scraping the personal information of EU citizens?
The GDPR deals strictly with the personally identifiable information of people within the European Union and the European Economic Area (EEA). So the next question is: **are you scraping data about European citizens?** If the answer is "no", you are safe. For instance, if you are extracting data concerning India, the US, or Australia, you need not worry about the GDPR; instead, you should look into the data protection laws of the respective jurisdiction. The GDPR's jurisdiction is limited to the EEA. If your scraping projects require you to scrape the PII of EU citizens, you must have a legal basis for doing so.
## Do you have a legal basis for scraping personal data?
The legal bases are set out in [Article 6 of the GDPR](https://www.legislation.gov.uk/eur/2016/679/article/6), and there are six legal bases for processing scraped data:
1. Consent
This can be your legal basis when the people whose data you are extracting have given you consent to extract it for specific purposes.
2. Contract
A contract with the data subjects can be a legal basis under the GDPR if the contract necessarily requires you to process the data.
3. Legal obligation
The third kind of legal basis applies when the data processing is necessary for you to comply with a legal obligation.
4. Vital interests
You can argue that *vital interests* is the legal basis for your scraping project if it is meant to save someone's life.
5. Public tasks
When data processing is carried out in the public interest or in the exercise of your duties as a public official, it counts as a legal basis.
6. Legitimate interest
If the data processing is necessary for the legitimate interest of the data controller, you can also count it as a basis for lawful data processing under the GDPR. But it will not be a legal basis if it overrides the fundamental rights or interests of the person whose data is being collected and processed.
In short, consent and contract work much the same way: if people have given you their consent, it is fine to process their data. When would this apply? Take an example. Suppose a fashion retail website collects product reviews from shoppers, together with the shoppers' PII, and makes them publicly available in the reviews. The PII could be age, name, and location; the general data would be the review text and timestamp. Now, if you only need to scrape the review text for research to drive the development of your new product, you need not worry about the GDPR. But if you are also scraping the name, age, location, and other details, you are entering PII territory and must comply with the GDPR to address legal compliance.
Vital interests, public tasks, and legal obligations will rarely form your legal basis, since they are clear-cut concepts with little room for theoretical argument. Legitimate interest, however, could be a solid legal basis if you are doing web scraping, although for most companies claiming it is also a challenge.
The [HiQ vs LinkedIn](https://es.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn) case is also an interesting read.
## What can you do to comply with the GDPR?
Here is a checklist to make sure your data processing and scraping project complies with the GDPR:
* **Stay away from misinterpreting the articles in ...
*Source post: "Cumplimiento de RGPD en web scraping" by melisaxinyue, 8/18/2021, https://www.reddit.com/r/u_melisaxinyue/comments/p6lsm1/cumplimiento_de_rgpd_en_web_scraping/*
*Source post: "Scraper idealista: Octoparse" by melisaxinyue, 11/13/2020 (content removed), https://www.reddit.com/r/u_melisaxinyue/comments/jteqly/scraper_idealista_octoparse/*
In [data analysis](https://es.wikipedia.org/wiki/An%C3%A1lisis_de_datos) work, there is one step that can never be skipped. It plays a vital role in the whole job yet is often overlooked: [**Data Cleaning**](https://es.wikipedia.org/wiki/Limpieza_de_datos). When it comes to data cleaning, many people have a series of questions in mind: What is data cleaning? What exactly does data cleaning wash away? What are the steps of data cleaning? Let's explore them one by one.
## What is data cleaning?
Data cleaning refers to tidying up the data: duplicate and excess data is filtered out and removed, missing data is filled in, erroneous data is corrected or deleted, and the result is organized into data we can process and use later.
## What exactly should be removed in data cleaning?
By definition, data cleaning is about cleaning dirty data, so what counts as [dirty data](https://www.tableau.com/es-es/learn/whitepapers/costs-of-dirty-data)? In data analysis, we often need to extract data from a database; but because a database is usually a collection of data on a given subject, drawn from multiple business systems, it inevitably contains incomplete, incorrect, and highly repetitive data. Such data is called dirty data.
What is the importance of data cleaning? Its goal is to improve data quality and reduce the error rate in data statistics. Before data analysis, we need to perform data cleaning with the help of a computer, which mainly includes checking the valid range of the data, checking the logical consistency of the data, and spot-checking data quality.
## Data cleaning steps
Let's look at the main data cleaning path, as shown in the figure:
https://preview.redd.it/a1d7tjkj5xv71.jpg?width=700&format=pjpg&auto=webp&v=enabled&s=077185c549e4abb781958559e3ed623a8c4cd48a
### 1. Clean missing values
Missing values are the most common data problem, and there are many ways to deal with them. We need to proceed step by step. The first step is to determine the range of missing values: compute the proportion of missing values for each field, then formulate a strategy based on that proportion and the importance of the field.
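The "compute the proportion of missing values per field" step can be sketched in plain Python (the records below are invented hotel rows for illustration):

```python
def missing_ratio(rows):
    """Share of missing (None) values per field across the rows,
    the first input to a missing-value strategy."""
    fields = rows[0].keys()
    return {
        f: sum(r[f] is None for r in rows) / len(rows)
        for f in fields
    }

records = [
    {"name": "Hotel A", "price": 120,  "rating": None},
    {"name": "Hotel B", "price": None, "rating": 4.5},
    {"name": None,      "price": 90,   "rating": 4.0},
]
# Each field is missing in exactly 1 of the 3 rows here, so every
# ratio comes out to 1/3.
```

Fields with a high ratio and low importance are candidates for deletion; important fields with a high ratio call for re-collection, as described in the following steps.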
### 2. Remove unnecessary fields
Removing unnecessary fields is very simple: they can be deleted directly. But keep in mind that every cleaning step should be backed up, or tested successfully on a small-scale sample before processing the full volume of data. If you delete the wrong data, you will regret it.
### 3. Fill in the missing content
There are several ways to fill in missing values: for instance, filling them in based on business knowledge or experience, or filling them in with the calculated results of the same indicator.
### 4. Re-collect the data
When certain indicators are very important and their missing rate is high, you need to find out whether the collection staff or business staff have other channels for obtaining the relevant data. This completes the missing-value cleaning step.
### 5. Relevance check
If your data comes from multiple sources, you need to verify their relevance.
[Octoparse](https://www.octoparse.es/) offers data cleaning options to convert the extracted data into the format you need; you can [refine the extracted data](https://helpcenter.octoparse.es/hc/es/articles/360056620474-Refinar-los-datos-extra%C3%ADdos-reemplazar-el-contenido-agregar-un-prefijo-) (replace content, add a prefix, ...) while scraping the web.
*Source post: "Comprender los 3 Problemas Principales sobre la Limpieza de Datos" by melisaxinyue, 10/27/2021, https://www.reddit.com/r/u_melisaxinyue/comments/qgob15/comprender_los_3_problemas_principales_sobre_la/*
One application of Octoparse is price monitoring. A template for scraping Yahoo! Finance was recently added to Octoparse! Check this article for the easiest way to extract stock prices from Yahoo! Finance.
[https://www.octoparse.es/blog/como-extraer-y-monitorear-los-precios-de-las-acciones-de-yahoo-finanzas](https://www.octoparse.es/blog/como-extraer-y-monitorear-los-precios-de-las-acciones-de-yahoo-finanzas)
https://preview.redd.it/vpxvq40mua971.png?width=2000&format=png&auto=webp&v=enabled&s=856edfbbb2c5fd21450632a1dc98280d87850d30
*Source post: "Cómo extraer y monitorear los precios de las acciones de Yahoo! Finanzas" by melisaxinyue, 7/5/2021, https://www.reddit.com/r/u_melisaxinyue/comments/odxgt3/cómo_extraer_y_monitorear_los_precios_de_las/*
https://www.octoparse.es/blog/octoparse-como-solucionador-de-problemas-para-el-raspado-de-datos
*Source post: "Octoparse como solucionador de problemas para data scraping" by melisaxinyue, 7/19/2021, https://www.reddit.com/r/u_melisaxinyue/comments/on6821/octoparse_como_solucionador_de_problemas_para/*
SEO (search engine optimization) is the process of affecting the visibility (ranking) of a website or web page within the organic results of search engines such as Google. To that end, search engines collect the list of pages on the web and order them according to their algorithm. In fact, this is the free way to improve your Google ranking and attract more traffic.
https://preview.redd.it/r5c1ood799g71.png?width=700&format=png&auto=webp&v=enabled&s=c14a5ae433f14c2de0f7f362a499516f91271767
**Table of contents**
[Improving SEO & Google ranking](https://www.octoparse.es/blog/una-forma-facil-y-gratuita-para-mejorar-tu-ranking-de-google#h1)
[Keyword research](https://www.octoparse.es/blog/una-forma-facil-y-gratuita-para-mejorar-tu-ranking-de-google#h2)
[Backlink research](https://www.octoparse.es/blog/una-forma-facil-y-gratuita-para-mejorar-tu-ranking-de-google#h3)
## Improving SEO & Google ranking
A study by Infront Webworks showed that the first page of Google receives 95% of web traffic, with the following pages receiving 5% or less of the total. So for most people, especially those who want to start a business on limited funds, SEO (search engine optimization) is a good way to improve Google ranking, get their websites seen, and attract more visitors at a relatively low cost.
However, SEO is a big subject, with many factors affecting Google ranking, such as:
* **On-page factors:** keyword in the title tag, keyword in the H1 tag, description, content length, etc.
* **Site factors:** sitemap, domain trust, server location, etc.
* **Off-page factors:** number of linking domains, domain authority of the linking page, authority of the linking domain, etc.
* **Domain factors:** length of domain registration, domain history, etc.
(Note: for more details, see [The 30 Most Important Google Ranking Factors a Beginner Should Know](https://unamo.com/blog/seo/30-important-google-ranking-factors-beginner-know).)
**Most of these factors can be researched with web scraping tools for free** (see [The 30 Best Free Web Scraping Software in 2021](https://www.octoparse.es/blog/30-mejores-software-gratuitos-de-web-scraping) for more information). And with enough information, you could develop a better strategy to improve your Google ranking.
So in this post I would only focus on the keyword and backlinks research to show you how to identify projected traffic and ultimately how to determine value of that ranking in a free and easy way if you don’t have any ideas.
Entonces, en este post solo me centraría en la investigación de palabras clave y backlinks para mostrarte cómo identificar el tráfico proyectado y, en última instancia, cómo determinar el valor de ese ranking de una manera fácil y gratuita si no tienes ninguna idea.
## Keyword research
I bet you would say: "Ah, that's easy. You know, there are plenty of keyword research tools, such as [Keyword Planner](https://www.google.com.hk/intl/en/adwords/?channel=ha-ef&sourceid=awo&subid=hk-en-ha-rhef-skhp0~200331725223&gclid=Cj0KEQjwz_TMBRD0jY-RusGilOYBEiQAN-TuFKvM3258DNURsErkKrAwxzhqdW7kGeSDwae1nDWiZJwaAsHq8P8HAQ&dclid=CI_As8H67tUCFcWlvQodl8wADQ) and [Buzzsumo](https://app.buzzsumo.com/research/most-shared). Any of them could help me find the most valuable keywords to target with SEO."
Yes, that's right. But how would you judge the value of those keywords? How do you know you are getting the right kind of visitors?
The answer is to research your market's keyword demand, predict changes in that demand, and produce the content that web searchers are actively looking for. The tools mentioned above can only show us the keywords that visitors typically type into search engines. They cannot directly show us how valuable it is to receive traffic from those searches. To understand the value of a keyword, we need to understand our own websites, form some hypotheses, test, and repeat: the classic web marketing formula. Here I will show you how it works.
For example, suppose you have already chosen some specific keywords and produced some content; now you need to measure the effects. That is, when searchers use these relevant keywords on Google, they should find your website and visit it. That is why you first need to know your ranking. I will take [Octoparse](https://www.octoparse.es/) as an example to illustrate this.
How can I find out where the Octoparse domain ranks when I search for the two relevant keywords "free web scraping tool" and "free web scraping service"? And how can I learn the rank and other details of the results listed above Octoparse, so I can better understand the value of the searched keywords?
The answer is a web scraping tool. With [the Octoparse web scraping tool](https://www.octoparse.es/), you can easily scrape the information you want by searching for those keywords (see [How to capture the entered search terms and the results?](https://www.octoparse.es/tutorial-7/ingresar-una-lista-de-palabras-clave-y-scrape-resultados) for more details).
Below is the result I got with the [Octoparse Cloud Service](https://www.octoparse.es/tutorial-7/que-es-la-extraccion-de-nubes).
https://preview.redd.it/dlv2vfxe99g71.png?width=1238&format=png&auto=webp&v=enabled&s=07d9ebb089ef18e27333cf2dd22a18a3ce941aef
I exported the extracted data to Excel and analyzed it. Unfortunately, I did not find the Octoparse domain in the spreadsheet, even though [Google Search Console](https://www.google.com/webmasters/tools/) analytics show that most visitors reach my website by searching for these two keywords. That is the problem most people run into but often ignore without realizing it. So if you want to improve your Google ranking, you need to check your ranking frequently and adjust your strategies accordingly.
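Checking whether (and where) a domain shows up in the exported results is a quick post-processing step. A minimal sketch in Python, using only the standard library; the result list below is invented sample data standing in for URLs exported from a scraping run:

```python
from urllib.parse import urlparse

def domain_rank(result_urls, domain):
    """Return the 1-based position of `domain` in a list of scraped
    search-result URLs, or None if it does not appear."""
    for position, url in enumerate(result_urls, start=1):
        host = urlparse(url).netloc.lower()
        if host == domain or host.endswith("." + domain):
            return position
    return None

# Hypothetical results scraped for "free web scraping tool"
results = [
    "https://example-competitor.com/tools",
    "https://www.octoparse.com/",
    "https://another-site.org/scrapers",
]
print(domain_rank(results, "octoparse.com"))  # 2
```

Running this over each keyword's exported result list tells you at a glance which keywords your domain already ranks for and which ones need work.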
For example, in my case, I need to check the domains of the higher-ranked websites and find out whether their PageRank is higher than mine. If so, could my content be of higher quality? If not, what other factors could be optimized to improve my ranking?
This simple example shows that using a web scraping tool for SEO can give you valuable insight into how hard it would be to rank for a given keyword, and into the competition as well.
## Backlink research
Imagine Google as the polling station of the Internet, counting the votes of all the links it finds on the web. Unlike your typical democracy, where each person gets one vote, Google gives more weight to votes from relevant, authoritative websites. So the most important factor in determining Google ranking tends to come down to those little blue links you see on almost every website.
So how can you get these blue links? The most common way is to look up competitors' backlinks with SEO tools such as [Open Site Explorer](https://moz.com/researchtools/ose/). See the Octoparse backlinks I found in Open Site Explorer below.
https://preview.redd.it/vwvori5g99g71.png?width=1085&format=png&auto=webp&v=enabled&s=d15f4b49a32a1f6f2946d814f13cc699214fe68c
But the problem is how I can get this information without upgrading my account to premium. The answer is to use a tool…
p0truj
u_melisaxinyue
melisaxinyue
t3_p0truj
https://www.reddit.com/r/u_melisaxinyue/comments/p0truj/una_forma_fácil_y_gratuita_para_mejorar_tu/
8/9/2021 3:55:06 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
An easy and free way to improve your Google ranking
False
1
p0truj
0
7400
5
5
Red
10
Dash Dot Dot
20
No
835
Posted
9/27/2021 6:21:01 AM
# Screen Scraping
**It usually refers to parsing the HTML of generated web content with programs designed to extract specific patterns from it.**
Screen scraping is the practice of collecting the screen-display data of one application and translating it so that another application can display it. It is typically done to capture data from a legacy application in order to display it with a more modern user interface.
It is sometimes confused with content scraping, which is the use of manual or automated means to harvest content from a website without the approval of the website's owner. Most often, screen scraping refers to a web client that parses the HTML pages of a target website to extract formatted data.
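That HTML-parsing flavor of screen scraping can be sketched with nothing but Python's standard library. The `<span class="price">` pattern and the sample HTML below are invented for illustration; a real page would be fetched first and would need its own selectors:

```python
from html.parser import HTMLParser

class PriceScraper(HTMLParser):
    """Collects the text of every <span class="price"> element."""
    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price:
            self.prices.append(data.strip())

html = '<div><span class="price">$19.99</span><span class="price">$4.50</span></div>'
scraper = PriceScraper()
scraper.feed(html)
print(scraper.prices)  # ['$19.99', '$4.50']
```

The parser walks the tag stream and keeps only the text inside elements that match the target pattern, which is the essence of extracting "specific patterns of content" from rendered HTML.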
## Screen Scrapers
A [screen scraper](https://www.octoparse.es/blog/30-mejores-software-gratuitos-de-web-scraping) is a computer program that uses a screen-scraping technique to translate between legacy application programs (written to communicate with now generally obsolete input/output devices and user interfaces) and new user interfaces, so that the logic and data associated with the legacy programs can continue to be used.
https://preview.redd.it/hd1t2zrznzp71.png?width=591&format=png&auto=webp&v=enabled&s=4ed660296b44200690b389fc85e014d34e83ec33
In the early days of the PC, screen scrapers emulated a terminal (e.g., an IBM 3270) and pretended to be a user in order to interactively extract and update information on the mainframe. More recently, the concept applies to any application that exposes an interface through web pages.
**What are screen scrapers used for?**
Screen scrapers have been applied in a wide range of fields for a variety of use cases. Some potential uses include:
* banking applications and financial transactions
* saving meaningful data for later use
* performing the actions a user would take on a website
* translating data from a legacy application to a modern application
* feeding data aggregators, such as price-comparison websites
* tracking user profiles to observe online activity; and
* gathering data
[Top 10 industries using screen scraping](https://preview.redd.it/qmxfx6l1ozp71.png?width=832&format=png&auto=webp&v=enabled&s=652f500d1aac1659e3187a789dd759e1aae97217)
One of the most significant use cases has been banking. Lenders may want to use screen scraping to gather a customer's financial data. Finance applications may use screen scraping to access a user's multiple accounts and aggregate all the information in one place. However, users should trust such an application explicitly, since they are entrusting that organization with their accounts, customer data, and passwords. Screen scraping can also be used in mortgage-provider applications.
An organization may also want to use screen scraping to translate between legacy application programs and new user interfaces (UIs) so that the logic and data associated with the legacy programs can continue to be used. This option is rarely chosen, and only when other methods are impractical.
## Data scraping without coding
https://preview.redd.it/6h2abal5ozp71.jpg?width=700&format=pjpg&auto=webp&v=enabled&s=7ba46b915941ee3377d922acff9831abb21e819b
If you want to try extraction, [Octoparse](https://www.octoparse.es/) lets you work with dynamic, unstructured data simply by clicking on individual data points, and it will automatically generate efficient code to extract the data. No coding is required in this process. It also lets you export the data to the format of your choice, such as Excel, JSON, CSV, TXT, or HTML, or even directly to your database via API.
pwba05
u_melisaxinyue
melisaxinyue
t3_pwba05
https://www.reddit.com/r/u_melisaxinyue/comments/pwba05/para_qué_se_usa_el_screen_scraping_y_cómo/
9/27/2021 6:21:01 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What is screen scraping used for, and how do you build one?
False
0.5
pwba05
0
7400
5
5
2
0.328947368421053
0
0
0
0
326
53.6184210526316
608
Red
10
Dash Dot Dot
20
No
834
Posted
11/13/2020 9:42:42 AM
Imagine you want to search for something on [Google](https://www.octoparse.es/blog/extraer-coordenadas-de-google-maps) and copy all the result links into an Excel file for later use. What would you do? You would go crazy clicking, copying, and pasting all the links by hand. You might ask: "Is there a machine that will automatically do all the work for me?"
Of course there is: a web scraper!
A web scraper is a tool used to extract data from websites. It can automatically collect or copy specific data from the web and place it in a central local database or spreadsheet for later retrieval or analysis.
It is used for contact scraping, monitoring online price changes and comparing prices, scraping product reviews (to watch the competition), gathering real-estate listings, and researching and tracking online presence and reputation.
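The copy-all-the-links chore from the opening paragraph is exactly what the simplest scraper automates: parse the page, collect every link, and write them to a spreadsheet-friendly file. A minimal sketch using only Python's standard library; the sample HTML stands in for a fetched results page:

```python
import csv
import io
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

page = '<a href="https://example.com/a">A</a> <a href="https://example.com/b">B</a>'
collector = LinkCollector()
collector.feed(page)

# Write the links to a CSV that Excel can open.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["url"])
writer.writerows([link] for link in collector.links)
print(buffer.getvalue())
```

The no-code tools below wrap this same parse-collect-export loop behind a point-and-click interface.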
But you may worry that you need coding skills to build such a web scraper. Don't worry! There are many free web scrapers that help you build your own scraper without coding. This article introduces several of them so you can choose!
[1. Import.io](https://www.import.io/)
https://preview.redd.it/fu3zpxg8azy51.png?width=873&format=png&auto=webp&v=enabled&s=1a51a3ae68183600a4acba1267605ded66bebcc9
Import.io is web-based software for web scraping. Using highly sophisticated machine learning algorithms, it extracts text, URLs, images, documents, and even screenshots from list and detail pages with just a URL you enter. The data can be accessed via API, XLSX/CSV, Google Sheets, etc. It lets you schedule when to fetch the data and supports almost any combination of times, days, weeks, and months. Best of all, it can even provide you with a data report after the extraction.
Despite all these powerful features, Import.io has cancelled its free version, and each user gets a 7-day free trial. It currently has four paid tiers with different limits on extractors, queries, and features: Essential ($299/month), Professional ($1,999/year), Enterprise ($4,999/year), and Premium ($9,999/year).
[2. Parsehub](https://www.parsehub.com/)
https://preview.redd.it/lma1te29azy51.png?width=929&format=png&auto=webp&v=enabled&s=a7d8bbeb98e431593748194f6980e0225a74241b
Parsehub, a cloud-based desktop application for data mining, is another easy-to-use scraper with a graphical application interface.
It works with any interactive page and easily searches through forms, opens drop-down menus, logs into websites, clicks on maps, and handles sites with infinite scroll, tabs, and pop-ups. In the element hierarchy, you will see the extracted data within seconds. It lets you access the data via API, CSV/Excel, Google Sheets, or Tableau.
Parsehub is free to start, but the free plan limits extraction speed (200 pages in 40 minutes), pages per run (200 pages), and the number of projects (5 projects). If you need higher extraction speed or more pages, you are better off on the standard plan ($149/month) or the professional plan ($499/month).
[3. Mozenda](http://www.mozenda.com/)
Mozenda, another web-based scraper, also magically turns web data, whatever its type, into a structured format.
It automatically identifies lists and helps you build agents that collect accurate data across many pages. And not only web pages: Mozenda even lets you extract data from documents such as Excel, Word, and PDF files the same way it extracts data from web pages. It supports publishing results in CSV, TSV, XML, or JSON format to an existing database, or directly to popular BI tools such as Amazon Web Services or Microsoft Azure® for quick analysis and visualization.
Mozenda offers a 30-day free trial, after which you can choose among its flexible pricing plans. It has a professional version ($100/month) and an enterprise version ($450/month), each with different limits on processing credits, storage, and agents.
[4. Content Grabber](http://www.tucows.com/preview/1601497/Content-Grabber)
https://preview.redd.it/egfnn1t9azy51.png?width=928&format=png&auto=webp&v=enabled&s=bbf1091ebff2e3ec9b3e7bffcf90d9a1aa6f3ea8
Content Grabber, with a typical point-and-click user interface, is used to extract practically any content from almost any website and save it as structured data in the format of your choice, including Excel reports, XML, CSV, and most databases.
Designed with performance and scalability as the top priority, Content Grabber offers a range of different browsers to achieve maximum performance in every scenario, from a fully dynamic web browser to the ultra-fast HTML5 parser-only browser. It tackles the reliability problem head-on, adding strong support for debugging, error handling, and logging.
You can download a 15-day free trial with all the features of the professional edition, but a maximum of 50 pages per agent, on Windows. The monthly subscription is $149 for the professional edition and $299 for a premium subscription. Content Grabber also lets users buy licenses outright to own the software permanently.
[5. Octoparse](https://www.octoparse.es/)
https://preview.redd.it/kosf8liaazy51.png?width=1918&format=png&auto=webp&v=enabled&s=ac45096624cd2f83c6091076931b4909f7ca28bb
Octoparse is a cloud-based web scraper that helps you easily extract any web data without coding. With its user-friendly interface, it can easily handle all kinds of websites, no matter whether they use JavaScript, [AJAX](https://www.octoparse.es/tutorial-7/ajax), or any other dynamic technique. Its advanced machine learning algorithm can accurately locate the data the moment you click on it. It supports XPath configuration to locate web elements precisely, and Regex configuration to re-format the extracted data. The extracted data can be accessed via Excel/CSV or API, or exported to your own database. Octoparse also has a powerful cloud platform for important features such as scheduled extraction and automatic IP rotation.
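To give a feel for what an XPath expression locates, independent of any particular tool, here is a sketch using Python's `xml.etree.ElementTree`, which supports a limited XPath subset; the markup is an invented sample, not anything produced by Octoparse:

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed snippet standing in for a product listing page.
snippet = """
<div>
  <ul>
    <li><span class="name">Widget</span><span class="price">9.99</span></li>
    <li><span class="name">Gadget</span><span class="price">14.99</span></li>
  </ul>
</div>
"""
root = ET.fromstring(snippet)

# Limited XPath: every <span class="price"> anywhere under the root.
prices = [el.text for el in root.findall('.//span[@class="price"]')]
print(prices)  # ['9.99', '14.99']
```

An XPath-capable scraper lets you write (or auto-generates) exactly this kind of path expression to pin down the elements you clicked on.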
All of these web scrapers can basically satisfy a wide variety of extraction needs, and software like Octoparse even runs blogs sharing news and data extraction cases, but it is important to weigh the features, limitations, and of course the price of the different programs against your individual requirements. Fortunately, every product listed offers a free trial before you buy.
I hope that with these [scrapers](https://www.octoparse.es/blog/servicio-de-web-crawler), web scraping is no longer a problem for you!
jtebqy
u_melisaxinyue
melisaxinyue
t3_jtebqy
https://www.reddit.com/r/u_melisaxinyue/comments/jtebqy/sí_existe_tal_cosa_como_un_web_scraper_chrome/
11/13/2020 9:42:42 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Yes, There Is Such a Thing as a Chrome Web Scraper!
False
1
jtebqy
0
7400
5
5
5
0.462534690101758
2
0.185013876040703
0
0
615
56.8917668825162
1081
Red
10
Dash Dot Dot
20
No
833
Posted
9/30/2021 9:26:39 AM
https://i.redd.it/g97rjk110mq71.jpg
pyg7l9
visualization
melisaxinyue
t3_pyg7l9
https://www.reddit.com/r/visualization/comments/pyg7l9/httpsoctoparseesbloglas9mejoresherramientasdevisua/
9/30/2021 9:26:39 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
https://octoparse.es/blog/las-9-mejores-herramientas-de-visualizacion-de-datos-para-no-desarrolladores
False
0.27
pyg7l9
0
7400
5
5
Red
10
Dash Dot Dot
20
No
832
Posted
7/22/2020 9:57:39 AM
**Web scraping** (also known as web data extraction or web crawling) is widely applied in many fields today. Before web scraping tools reached the public, scraping was a magic word for ordinary people with no programming skills; its high barrier to entry kept them locked out of Big Data. **A** [web scraping tool](https://www.octoparse.es/blog/las-20-mejores-herramientas-de-web-scraping) **is automated crawling technology, and it closes the gap between Big Data and everyone else.**
**What are the benefits of using a** [web spider](https://www.octoparse.es/blog) **tool?**
* It frees your hands from repetitive copy-and-paste work.
* It puts the extracted data into a well-structured format, including but not limited to Excel, HTML, and CSV.
* It saves you the time and money of hiring a professional data analyst.
* It is the cure for marketers, sellers, journalists, YouTubers, researchers, and many others who lack technical skills.
**Here's the deal**
I have listed the **20 BEST web scrapers** for your reference. Welcome to take full advantage of them!
**1.** [**Octoparse**](http://octoparse.es/)
**Octoparse** is a free online spider for extracting almost any kind of data you need from websites. You can use Octoparse to scrape a website with its extensive functionality and capabilities. It has two operation modes, **Wizard Mode** and **Advanced Mode**, so non-programmers can pick it up quickly. The friendly point-and-click interface guides you through the entire extraction process. As a result, you can easily extract website content and save it in structured formats such as EXCEL, TXT, HTML, or your databases in a short period of time.
In addition, it provides **Scheduled Cloud Extraction**, which lets you extract dynamic data in real time and keep a tracking record of the website's updates.
You can also scrape complex websites with difficult structures by using the built-in Regex and XPath configuration to locate elements precisely. And you no longer need to worry about IP blocking: Octoparse offers IP proxy servers that rotate IPs automatically and go undetected by aggressive websites.
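As a rough illustration of what a Regex re-formatting step does with scraped text, here is a sketch in Python; the raw strings are invented examples rather than real scraper output:

```python
import re

def normalize_price(raw):
    """Pull a numeric price out of a messy scraped string."""
    match = re.search(r"(\d+(?:[.,]\d{2})?)", raw)
    if not match:
        return None
    # Treat a decimal comma the same as a decimal point.
    return float(match.group(1).replace(",", "."))

scraped = ["Price: $19.99 (incl. VAT)", "EUR 7,50", "sold out"]
print([normalize_price(s) for s in scraped])  # [19.99, 7.5, None]
```

A scraper's Regex configuration applies the same idea, turning raw captured strings into clean, uniform values before export.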
To conclude, **Octoparse should be able to satisfy users' crawling needs, both basic and advanced, without requiring any coding skills.**
**2.** [**Cyotek WebCopy**](https://www.cyotek.com/cyotek-webcopy)
WebCopy is a free website crawler that lets you copy partial or complete websites locally onto your hard drive for offline reference.
You can change its settings to tell the bot how you want it to crawl. Beyond that, you can also **configure domain aliases, user-agent strings, default documents**, and more.
However, WebCopy does not include a virtual DOM or any form of JavaScript parsing. If a website makes heavy use of JavaScript, WebCopy most likely will not be able to make a true copy, and it will probably fail to handle dynamic website layouts correctly.
**3.** [**HTTrack**](https://www.httrack.com/)
As a free website crawling program, HTTrack **provides features well suited to downloading an entire website to your PC**. It has versions available for Windows, Linux, Sun Solaris, and other Unix systems, which covers most users. It is interesting that HTTrack can mirror one site, or more than one site together (with shared links). You can decide how many connections to open simultaneously while downloading web pages under "set options". You can get the photos, files, and HTML code of your mirrored website and resume interrupted downloads.
In addition, proxy support is available within **HTTrack to maximize speed.**
HTTrack works as a command-line program, for either private (capture) or professional (online web mirror) use. That said, HTTrack is best suited to people with advanced programming skills.
**4**. [**Getleft**](https://sourceforge.net/projects/getleftdown/)
Getleft is a free, easy-to-use website grabber. It lets you **download an entire website** or any individual web page. After launching Getleft, you can enter a URL and choose the files you want to download before it starts. As it proceeds, it rewrites all the links for local browsing. It also offers multilingual support: Getleft now supports 14 languages! However, it only provides limited FTP support; it will download the files, but not recursively.
Overall, Getleft should satisfy users' basic scraping needs without requiring more sophisticated skills.
**5**. [**Scraper**](https://chrome.google.com/webstore/detail/scraper/mbigbapnjcgaffohmbkdlecaccepngjd)
Scraper is a Chrome extension with limited data extraction features, but it is useful for online research. It also lets you **export the data to Google Sheets**. You can easily copy the data to the clipboard or store it in spreadsheets with OAuth. Scraper can generate XPaths automatically to define the URLs to scrape. It does not offer an all-inclusive scraping service, but it can satisfy most people's data extraction needs.
**6**. [**OutWit Hub**](https://addons.mozilla.org/en-US/firefox/addon/outwit-hub/)
OutWit Hub is a Firefox add-on with dozens of data extraction features to simplify your web searches. This web scraping tool can browse through pages and store the extracted information in an appropriate format.
OutWit Hub offers **a single interface for scraping small or large amounts of data as needed**. OutWit Hub lets you scrape any web page right from the browser. It can even create automatic agents to extract data.
It is one of the simplest web scraping tools, free to use, and it offers the convenience of extracting web data without writing code.
**7.** [**ParseHub**](https://www.parsehub.com/)
Parsehub is an excellent web scraper that supports collecting data from websites that use **AJAX, JavaScript, cookies**, and similar technologies. Its machine learning technology can read, analyze, and then transform web documents into relevant data.
Parsehub's desktop application supports systems such as Windows, Mac OS X, and Linux. You can even use the web app built into the browser.
As a free program, Parsehub limits you to five public projects. The paid subscription plans let you create at least 20 private projects for scraping websites.
**8**. [**Visual Scraper**](http://visualscraper.blogspot.hk/)
VisualScraper is another great free, no-code web scraper with a simple point-and-click interface. You can get **real-time** data from multiple web pages and export the extracted data as **CSV, XML, JSON, or SQL files.** Besides its SaaS, VisualScraper offers web scraping services such as data delivery and the creation of software extractors.
Visual Scraper lets users schedule a project to run at a specific time, or to repeat the sequence every minute, day, week, month, or year. Users can use it to extract news and forum posts frequently.
**9.** [**Scrapinghub**](https://scrapinghub.com/)
Scrapinghub is a **cloud-based data extraction tool** that helps thousands of developers fetch valuable data. Its open-source visual scraping tool lets users scrape websites without any programming knowledge.
Scrapinghub uses Crawlera, a smart proxy rotator that supports bypassing bot countermeasures…
hvr1dp
webscraping
melisaxinyue
t3_hvr1dp
https://www.reddit.com/r/webscraping/comments/hvr1dp/las_20_mejores_herramientas_de_web_scraping_para/
7/22/2020 9:57:39 AM
1/1/0001 12:00:00 AM
False
False
2
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
The 20 Best Web Scraping Tools for Data Extraction
False
0.75
hvr1dp
0
7400
5
5
9
0.72463768115942
3
0.241545893719807
0
0
710
57.1658615136876
1242
Red
10
Dash Dot Dot
20
No
831
Commented
7/22/2020 9:58:53 AM
Click to see the original article: [The 20 Best Free Web Scraping Tools in 2020](https://www.octoparse.es/blog/las-20-mejores-herramientas-de-web-scraping)
fyuwe50
webscraping
melisaxinyue
t1_fyuwe50
https://www.reddit.com/r/webscraping/comments/hvr1dp/las_20_mejores_herramientas_de_web_scraping_para/fyuwe50/
7/22/2020 9:58:53 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hvr1dp
t3_hvr1dp
hvr1dp
0
hvr1dp
True
False
False
0
7400
5
5
0
0
0
0
0
0
18
62.0689655172414
29
Red
10
Dash Dot Dot
20
No
830
Posted
8/2/2021 6:23:46 AM
Since launching an online business involves little or no upfront cost, aspiring entrepreneurs are likely to face several rivals who may try to undercut their prices. It is therefore important to monitor your competitors to determine which products they offer, and at what price.
[Monitoring competitors' product listings](https://www.octoparse.es/help) will give you a wealth of valuable information about them; perhaps a well-funded rival is running penetration pricing in one of your niches, or perhaps they are testing a new competitive pricing model. Whatever your business, knowing what your competitors are doing is one of the first steps to success.
**Table of contents**
* [Identifying optimal pricing strategies using competitor listings](https://www.octoparse.es/blog/seguimiento-de-la-competidores-para-la-estrategia-de-precios-y-la-planificacion-de-productos#Identifying%20optimal%20pricing%20strategies%20using%20competitor%E2%80%99s%20listings)
* [Comparing product assortment against competitor listings](https://www.octoparse.es/blog/seguimiento-de-la-competidores-para-la-estrategia-de-precios-y-la-planificacion-de-productos#%3EComparing%20product%20assortment%20against%20competitors%E2%80%99%20listings)
[Monitoring Competitors for Pricing Strategy and Product Planning](https://preview.redd.it/rhocnlee1we71.png?width=1600&format=png&auto=webp&v=enabled&s=7a9c1f60e1511760ae82161d56f9084a39eaa670)
## Identifying [optimal pricing strategies](https://www.octoparse.es/help) using competitor listings
Price is one of the most important factors in the purchase decision. Every customer takes price into account when buying a product, and customers are more likely to research a product as its price increases. That is why it is important to know how competitors price a product similar to yours on different channels, and then maintain a competitive price. This does not always mean lowering your prices; in fact, you may find you have been charging less for a product than you should. To start building a pricing strategy using competitor listing data, begin by answering the following questions:
### What is my pricing model?
Start by identifying the [general pricing model](https://www.volusion.com/blog/how-to-price-ecommerce-products-to-compete-online/) that best suits your products. If you sell replacement toner cartridges to small businesses, you are probably working on an economy model, trying to undercut competitors' prices. If you sell luxury handbags to celebrities, lowering prices could hurt sales. Whether you work on a cost-based or a value-based pricing model, it is essential to know what your competitors' products or services sell for.
### How do my customers perceive the value of my products?
Perhaps you lowered the price of a product in the past but ultimately saw no increase in sales. That is likely because your customers saw your product differently from what the lower price conveyed. Cheaper prices do not always drive purchase decisions; instead, consumers are more willing to pay a price they consider "reasonable" for the product they are buying. For this reason, retailers should set prices based on product perception, to deliver the best perceived product value. Comparing prices with your competitors is one way to understand how consumers perceive similar products.
## Comparing product assortment against competitors' listings
With competitor monitoring, your main focus is benchmarking against competitors to uncover gaps in your assortment. [Monitoring the products your competitors](https://dataservice.octoparse.com/comercio-electronico-y-venta-minorista?__hstc=97730752.cfa4be011358393efe2a4d1b0e579f03.1626416084584.1627872776490.1627884756794.36&__hssc=97730752.4.1627884756794&__hsfp=1029763304) have recently added to their stores can give you valuable insight into market trends. If you run an online store that sells sporting goods and notice your competitors adding more dumbbells from various brands, you can assume they see the market value of dumbbells rising. This suggests you should expand your assortment if your store does not carry dumbbells.
### Who are my real competitors?
Understanding your competitors' product assortment and being ready to adapt to the market is essential. However, it is just as important to recognize who your real competitors are. When we think of competitors, we usually think of companies in the same industry; while this sounds right in theory, it is not always true in practice. Benchmark against the wrong rivals, and your pricing and assortment strategy will fall short.
To find the right target, you need to calculate a price index. A concept borrowed from economics, the price index is used to measure the rate of inflation. In e-commerce, we can use it to examine how much impact competitors will have on your business.
To measure the price index for a given product (for example, the dumbbells mentioned above), divide the competitor's dumbbell price by your store's dumbbell price and multiply by 100.
Read more: [Why Do You Need an E-commerce Scraper for Competitor Tracking?](https://www.octoparse.es/help)
[Web Scraping in the Big Data Solution](https://octoparsewebscraping.medium.com/web-scraping-in-the-big-data-solution-7d2804d41477)
**Price Index = (Competitor's Product Cost / Your Product Cost) x 100**
The price index of a single product will not tell us anything valuable on its own. We need to add up the price index of every product and divide by the number of products to get the average price index. Repeating the calculation above gives us a per-product price index for each competitor. From there, we can plot all the data points on a chart and study the deviations to determine which competitor has the greatest impact on us.
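The formula and the averaging step above can be sketched in a few lines of Python. The product names and prices below are made-up examples, not data from any real store.

```python
# Minimal sketch: per-product price index and a competitor's average index.
def price_index(competitor_price: float, our_price: float) -> float:
    """Price Index = (competitor's cost / our cost) x 100."""
    return competitor_price / our_price * 100

def average_price_index(catalog: dict) -> float:
    """catalog maps product name -> (competitor price, our price)."""
    indexes = [price_index(c, o) for c, o in catalog.values()]
    return sum(indexes) / len(indexes)

catalog = {
    "dumbbell 10kg": (25.0, 20.0),   # competitor 25% more expensive -> index 125
    "kettlebell 8kg": (18.0, 24.0),  # competitor undercuts us -> index 75
}
print(round(average_price_index(catalog), 1))  # (125.0 + 75.0) / 2 = 100.0
```

An average index above 100 means that competitor generally prices higher than you; below 100, they undercut you.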
[Price Index](https://preview.redd.it/0t1stz8k1we71.jpg?width=666&format=pjpg&auto=webp&v=enabled&s=f2a22d6fdd1a453bc06e416e5fc4aadbc09c58cf)
**How do I track competitors?**
Many data solution providers charge a lot of money just for competitor monitoring. Yet despite the high cost, you still have to deal with the underlying problem of security. Web scraping tools like [Octoparse](https://www.octoparse.es/) serve as an alternative for prudent investors who are conservative about security and careful about spending. Octoparse gives businesses of any size the ability to stay informed automatically, letting retailers keep an eye on every category of each competitor across different web sources at a much lower cost.
The saying "Keep your friends close and your enemies ...
Posted by melisaxinyue, 8/2/2021 6:23:46 AM: https://www.reddit.com/r/u_melisaxinyue/comments/ow8xgq/seguimiento_de_competidores_para_la_estrategia_de/
Competitor Tracking for Pricing Strategy and Product Planning for 2021
Posted
11/13/2020 10:15:21 AM
Map data is increasingly important in the Internet era: it generates business value and supports decision-making. Such data is widely used across industries; for example, a catering company can decide where to open a new restaurant by analyzing map data and nearby competitors.
Much like the article [The 20 best web scraping tools](https://www.octoparse.es/blog/las-20-mejores-herramientas-de-web-scraping), here we pick the 5 best Google Maps crawlers of 2019 and review the features of the best crawlers out there. There are different ways to build a Google Maps crawler. Try the methods below and build your own crawler to get the data you need!
1. [Places API of Google Maps Platform](https://developers.google.com/places/web-service/intro)
Yes, Google Maps Platform provides the Places API for developers. It is one of the best ways to collect place data from Google Maps: developers can get up-to-date information on millions of locations through HTTP requests to the API.
Before using the Places API, you must set up an account and create your own API key. The Places API is not free; it uses a pay-as-you-go pricing model. Also, the data fields it returns are limited, so you may not get all the data you need.
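As a rough sketch of how such an HTTP request is assembled, the snippet below only builds a Places API Text Search URL; no request is sent. The endpoint and parameter names follow the public Places API documentation, and the key is a placeholder you would replace with your own.

```python
# Build (but do not send) a Places API Text Search request URL.
from urllib.parse import urlencode

BASE = "https://maps.googleapis.com/maps/api/place/textsearch/json"

def places_search_url(query: str, api_key: str) -> str:
    """Return the request URL for a text search; 'query' and 'key' are
    the documented Text Search parameters."""
    return BASE + "?" + urlencode({"query": query, "key": api_key})

url = places_search_url("restaurants in Madrid", "YOUR_API_KEY")
print(url)
```

In real use you would issue a GET request to this URL and parse the JSON response, minding the per-request billing.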
2. [Octoparse](https://www.octoparse.es/)
Octoparse is a powerful web scraping tool for non-programmers that lets you build crawlers to scrape data. With a few clicks, you can turn websites into valuable data. Octoparse's features let you customize crawlers to handle 99% of complicated website structures and capture data. [Click here](https://www.octoparse.es/blog/extraer-coordenadas-de-google-maps) for the tutorial on scraping Google Maps.
In addition, Octoparse offers task templates for certain websites, including Google Maps, which make web scraping easier and accessible to anyone. Simply enter keywords or URLs and the template starts scraping data automatically.
Crawlers built with Octoparse, templates included, can run both on local machines and in the cloud. Although Octoparse is powerful and easy to use, you still need to learn how to build your own task, which can take a little time.
3. Python frameworks or libraries
You can use powerful Python libraries or frameworks, such as [Scrapy](https://scrapy.org/) and [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/), to customize your crawler and scrape exactly what you want. Specifically, Scrapy is a framework used to download, clean, and store data from web pages, with a lot of built-in code to save you time, while Beautiful Soup is a library that helps programmers extract data from web pages quickly.
This way, you have to write the code yourself to build the crawler and handle everything. Therefore, only programmers who have mastered web scraping are up to this kind of project.
4. Open-source projects on GitHub
Some projects for crawling Google Maps can be found on GitHub, such as this project written in Node.js[.](https://github.com/thiago-js/scraping-google-maps) Many excellent open-source projects have already been created by others, so we do not need to reinvent the wheel.
Even if you do not need to write most of the code yourself, you still need to know the basics and write some code to run the script, which makes it hard for those who know little about coding. The quantity and quality of the dataset depend heavily on the GitHub project, which may lack maintenance. Also, the output may only be a .txt file, so if you need a lot of data, this may not be the best way to get it.
5. Web Scraper
Web Scraper is the most popular web scraping extension. Download the Google Chrome browser, install the [Web Scraper](https://chrome.google.com/webstore/detail/web-scraper/jnhgnonknehpejjnehehllkliplmbmhn) extension, and you can start using it. You do not have to write code or download software to scrape data; a Chrome extension is enough for most cases.
However, the extension is not that powerful when handling complex web page structures or scraping heavy volumes of data.
Posted by melisaxinyue, 11/13/2020 10:15:21 AM: https://www.reddit.com/r/u_melisaxinyue/comments/jtenms/5_mejores_web_scrapers_de_google_maps_en_2020/
5 Best Google Maps Web Scrapers in 2020
Posted
9/30/2021 7:33:56 AM
# What is web scraping?
[Web scraping](https://www.octoparse.es/), also known as [web harvesting](https://www.octoparse.es/blog/30-mejores-software-gratuitos-de-web-scraping) and [web data extraction](http://www.dataextraction.io/), basically refers to collecting data from websites through the Hypertext Transfer Protocol (HTTP) or through web browsers.
It is a web technique for extracting data from the web. It turns unstructured data or raw source code into structured data that you can store on your local computer or in a database. Normally, data available on the Internet can only be viewed in a web browser; almost no website gives users the ability to export the information displayed on the page. The only way to get it is by repetitive copying and pasting, which is a tedious and time-consuming way to capture and separate the data by hand.
Fortunately, web scraping can run the process automatically and organize the data within minutes.
## How does web scraping work?
In general, web scraping involves three steps:
* **First**, we send a GET request to the server and receive a response in the form of web content.
* **Next**, we parse the website's HTML code, following its tree structure.
* **Finally**, we use a Python library to search the parse tree.
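The last two steps above can be sketched with the Python standard library alone. To keep the example runnable offline, a static HTML snippet stands in for the body that the GET request in step one would return; the tag names and class are invented for illustration.

```python
# Steps 2-3: parse HTML following its tree structure and search it.
from html.parser import HTMLParser

html_body = """
<html><body>
  <h2 class="title">Product A</h2>
  <h2 class="title">Product B</h2>
</body></html>
"""

class TitleCollector(HTMLParser):
    """Walks the tag tree and collects the text inside <h2 class="title">."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []
    def handle_starttag(self, tag, attrs):
        if tag == "h2" and ("class", "title") in attrs:
            self.in_title = True
    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_title = False
    def handle_data(self, data):
        if self.in_title and data.strip():
            self.titles.append(data.strip())

parser = TitleCollector()
parser.feed(html_body)
print(parser.titles)  # ['Product A', 'Product B']
```

A library like Beautiful Soup wraps this same parsing work in a much friendlier API; the point here is only to show the request/parse/search pipeline.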
## How did it all start?
Although to many people it sounds as fresh as concepts like "Big Data" or "machine learning," the history of web scraping is actually much longer. It dates back to the time when the World Wide Web, colloquially "the Internet," was born.
In the beginning, the Internet was not even searchable. Before search engines were developed, the Internet was just a collection of File Transfer Protocol (FTP) sites that users browsed to find specific shared files. To find and organize the distributed data available on the Internet, people created a specific automated program, known today as the **web crawler/bot**, to **fetch every page** on the Internet and then **copy all the content** into databases for indexing.
Then the Internet grew, becoming home to millions of web pages holding vast amounts of data in multiple forms, including text, images, video, and audio. It turned into an open data source.
As this data source became incredibly rich and easy to search, people began to find that the information they needed could be located easily. That information was usually scattered across many websites, but the problem was that when they wanted data from the Internet, not every website offered a download option, and copying and pasting by hand was cumbersome and inefficient.
And that is where web scraping came in. Web scraping is actually powered by web bots/crawlers, which work the same way as the ones used in search engines. That is, **fetch and copy**. The only difference may be the scale: web scraping focuses on extracting only specific data from certain websites, whereas search engines often fetch most of the websites on the Internet.
## How did web scraping develop?
* **1989 The birth of the World Wide Web**
Technically, the World Wide Web is different from the Internet. The former refers to the information **space**, while the latter is the **network** of computers.
Thanks to Tim Berners-Lee, the inventor of the WWW, the following three things became part of our daily life:
* Uniform Resource Locators (URLs), which we use to go to the website we want;
* embedded hyperlinks, which let us navigate between web pages, such as product detail pages where we can find specifications and things like "customers who bought this also bought";
* web pages containing not only text but also images, audio, video, and software components.
* **1990 The first web browser**
Also invented by Tim Berners-Lee, it was called WorldWideWeb (no spaces), named after the WWW project. One year after the web appeared, people had a way to see it and interact with it.
* **1991 The first web server and the first http:// web page**
The web kept growing at a fairly moderate speed. In 1991, Tim Berners-Lee made the official announcement of the World Wide Web and released the first web server software, marking the web's debut as a public service on the Internet and changing history forever. By 1994, the number of HTTP servers was over 200.
* **1993 The first web robot - World Wide Web Wanderer**
In 1993, Matthew Gray, who studied physics at the Massachusetts Institute of Technology (MIT) and was one of the three members of the Student Information Processing Board (SIPB) who created the site www.mit.edu, decided to write a program called the World Wide Web Wanderer to systematically traverse the Web and collect sites.
The Wanderer first ran in the spring of 1993, becoming the first automated web agent (spider or web crawler). The Wanderer certainly did not reach every site on the Web, but it ran with a consistent methodology and, hopefully, yielded consistent data about the Web's growth.
* **December 1993 The first crawler-based web search engine - JumpStation**
Since there were not that many websites on the web yet, search engines at the time relied on human website administrators to collect and edit links into a particular format.
JumpStation brought a new leap: it was the first WWW search engine to rely on a web robot.
Since then, people began using these programmatic web crawlers to harvest and organize the Internet. From Infoseek, Altavista, and Excite to Bing and Google today, the core of a search engine bot remains the same.
Because web pages are designed for human users rather than for automated use, even with the development of the web bot it was still hard for computer engineers and scientists to do web scraping, let alone ordinary people. So people set out to make web scraping more accessible.
* **2000 Web APIs and API crawlers**
API stands for **Application Programming Interface**. It is an interface that makes developing a program much easier by providing its building blocks.
In 2000, Salesforce and eBay launched their own APIs, with which programmers could access and download some of the publicly available data.
Since then, many websites have offered web APIs so people can access their public database.
*Send an HTTP request, get JSON or XML in return.*
Web APIs collect only the data the website chooses to provide, but they offer developers a friendlier way to do web scraping.
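The "JSON in return" half of that exchange is a one-liner to handle in code. The payload below is a made-up example, not a real Salesforce or eBay response:

```python
# Parse an API-style JSON response body into Python objects.
import json

response_body = '{"items": [{"id": 1, "price": 9.5}, {"id": 2, "price": 4.25}]}'
data = json.loads(response_body)

# Once parsed, the structured data is trivial to work with:
total = sum(item["price"] for item in data["items"])
print(total)  # 13.75
```

This is exactly why APIs were friendlier than scraping raw HTML: the structure arrives ready-made instead of having to be recovered from markup.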
* **2004 Python Beautiful Soup**
Not all websites offer APIs. Even when they do, they do not provide all the data people want. So programmers kept working on an approach that could make web scraping easier.
In 2004, Beautiful Soup was released. It is a library designed for Python.
In computer programming, a library is a collection of script modules, such as commonly used algorithms, that can be used without rewriting, simplifying the programming process.
With simple commands, Beautiful Soup makes sense of a site's structure and helps parse content out of the HTML container. It is considered the most sophisticated and advanced library for web scraping, and also one of the most common and popular approaches ...
Posted by melisaxinyue, 9/30/2021 7:33:56 AM: https://www.reddit.com/r/u_melisaxinyue/comments/pyew3s/servicios_de_web_scraping_cómo_comenzó_y_qué/
Web Scraping Services: How It Started and What Will Happen in the Future
Posted
8/16/2021 7:35:44 AM
With the rising number of online shoppers, customers are gradually adapting to the e-commerce model and becoming more demanding. This shift in buying behavior certainly creates more opportunities, but also challenges, for dropshippers. For most dropshipping business owners, the question now is: how do you level up your business to stand out from the competition and keep winning more customers?
You have probably sniffed out the new trend: [D2C](https://www.sana-commerce.com/e-commerce-terms/what-is-d2c-e-commerce/) (Direct-to-Consumer). Although the term has been floating around for a couple of years, few people paid attention until the COVID-19 pandemic hit. With growing e-commerce demand removing the barrier between sellers and consumers, D2C is now the new black.
[Original Image](https://preview.redd.it/m2hjcrowaoh71.png?width=1600&format=png&auto=webp&v=enabled&s=457ae6ab71c22c8b3d5824f03cf6ff589d9ea0bd)
Certainly, there are things we can borrow from the D2C business model and apply to a dropshipping business.
"According to [eMarketer](https://www.emarketer.com/content/why-more-brands-should-leverage-d2c-model)'s analysis in February 2021, D2C e-commerce sales in the US grew 45.5% in 2020, generating about $111.54 billion and accounting for 14% of total retail e-commerce sales. D2C is expected to keep relatively steady growth every year through 2023, by which time D2C e-commerce sales could have reached $174.98 billion."
**Table of Contents**
[What is D2C and what makes it so popular?](https://www.octoparse.es/blog/como-pueden-los-dropshippers-aprender-del-negocio-d2c#h1)
[The difference between D2C and dropshipping](https://www.octoparse.es/blog/como-pueden-los-dropshippers-aprender-del-negocio-d2c#h2)
[Dropship Strategy 1: start with the right suppliers](https://www.octoparse.es/blog/como-pueden-los-dropshippers-aprender-del-negocio-d2c#h3)
[Dropship Strategy 2: show off your brand with an online storefront](https://www.octoparse.es/blog/como-pueden-los-dropshippers-aprender-del-negocio-d2c#h3)
[Dropship Strategy 3: level up your services](https://www.octoparse.es/blog/como-pueden-los-dropshippers-aprender-del-negocio-d2c#h3)
## What is D2C and what makes it so popular?
Under a D2C, or Direct-to-Consumer, model, brand owners can sell products directly to their customers on the brand's official website, rather than relying on brick-and-mortar stores, e-commerce marketplaces, or any other intermediary platform.
This has transcended the traditional B2C business model in many ways. Business owners applying a D2C strategy gain more control over their brands without a retailer standing in the middle. As a result, they can build closer relationships with end customers and respond faster to market demand.
## How is D2C different from dropshipping?
Don't mix up these two concepts. Dropshippers handle customer orders directly and then ship the products through a third-party supplier. As you have noticed, the cost of physical storage is where the opportunity margin lies. Dropshippers make no upfront investment in holding inventory, BUT that is not sustainable in the long run. D2C, on the other hand, does not sound like a lean approach, since you pay not only for storage but for everything related to operations. Still, **supporters of the D2C model know that those who get close to their customers can stand firm to the end of the competition.**
If you want to build a dropshipping business as sustainable as the D2C model, it is not hard to see that you only need a few things to get ahead: **reliable suppliers, knowing your customers, and a leveled-up order fulfillment solution. Here is what a dropshipping strategy looks like at a high level:**
## Start with the right suppliers:
A reliable supplier selling products that fit your marketing niche can improve your odds in the game. When we think of suppliers, we think of AliExpress or other local companies that can offer quality products at a lower cost.
With so many product ideas to look at and so much information available online, finding a good fit through market research can be quite challenging. That is where web scraping comes to the rescue.
[**Web scraping**](https://www.octoparse.es/blog/introduccion-a-las-tecnicas-y-herramientas-de-web-scraping) **is the best practice for retrieving scattered web data in a usable format**. It can collect product information from multiple sources in an organized, structured format. Eventually, the data can be synced to your online store through an [API integration](https://helpcenter.octoparse.es/hc/es/articles/1500000914301-Conectar-la-API-de-Octoparse-paso-a-paso) that connects your online stores and suppliers seamlessly. It can provide you with basic product information including prices, product names, SKUs, inventory, and image URLs.
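As a rough sketch of what "an organized, structured format" means in practice, the snippet below writes scraped product records to CSV. The field names mirror the ones listed above; the records and URLs are invented examples.

```python
# Normalize scraped product records into a structured CSV.
import csv
import io

records = [
    {"name": "Razor Kit", "sku": "RZ-001", "price": 12.99,
     "stock": 40, "image_url": "https://example.com/rz001.jpg"},
    {"name": "Blade Pack", "sku": "BL-002", "price": 7.49,
     "stock": 0, "image_url": "https://example.com/bl002.jpg"},
]

buf = io.StringIO()  # swap for open("products.csv", "w", newline="") to write a file
writer = csv.DictWriter(buf, fieldnames=["name", "sku", "price", "stock", "image_url"])
writer.writeheader()
writer.writerows(records)
print(buf.getvalue().splitlines()[0])  # header row: name,sku,price,stock,image_url
```

A file in this shape is what a store or API integration can then ingest, regardless of which tool did the scraping.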
**Next, collect data to truly understand your customers.**
Every customer action offers valuable insight into customer behavior. To determine how customers react to your products or your competitors', you can [monitor products](http://www.octoparse.es/blog/10-herramienta-de-monitoreo-de-precios) across marketplaces. This gives you more insight into your position in the market. Tools like [**Octoparse**](https://www.octoparse.es/) can collect information such as reviews, reviewers, ratings, stock levels, and so on.
Show off your brand with an online storefront:
Instead of relying on a marketplace like Amazon, with a million product varieties competing in every product line, having a standalone online store lets you showcase your unique products. The best part is that you have full control over your traffic and your end customers, with whom you can build relationships and share a personalized brand experience.
Razor giant Harry's went for exactly this marketing strategy, which helped it earn over $1 million in sales in its first month and quickly climb to $100 million within 2 years. Andy Katz-Mayfield, co-founder of Harry's, realized that what consumers need are simple but effective products that feel good to use. That was the breakthrough for starting his own razor business, built on a simple selling model: a great razor delivered straight to your door.
See the difference? Harry's tries to offer a simplified yet personalized shopping journey, plus a genuine brand story that resonates with customers. This makes customers feel more connected to the brand and easily eases their concerns about a new business.
## Finally, level up order fulfillment services and, with them, the customer experience.
As a dropshipper, you probably got used to suppliers handling the whole order fulfillment process and paid little attention to it. But if this "last mile" work becomes problematic, all your previous effort to improve the customer experience will be ruined. Transparent order status, quality products as promised in the listings, packages with visibility of ...
Posted by melisaxinyue, 8/16/2021 7:35:44 AM: https://www.reddit.com/r/u_melisaxinyue/comments/p5bcke/cómo_pueden_los_dropshippers_aprender_del_negocio/
How can dropshippers learn from the D2C business?
Posted
9/1/2021 9:18:25 AM
[**Web scraping**](https://octoparse.es/) (also called [**web data extraction**](https://octoparse.es/download), web crawling, or web spidering) is a technique for extracting data from web pages. It turns unstructured data into structured data that can be stored on your local computer or in a database.
Building a web scraper can be difficult for people who know nothing about coding. Fortunately, there are tools available both for people with programming skills and for those without. Here is our list of the 30 most popular web scraping tools, from open-source libraries to browser extensions to desktop software.
**Table of Contents**
* Beautiful Soup
* Octoparse
* Import.io
* Mozenda
* Parsehub
* Crawlmonster
* Connotate
* Common Crawl
* Crawly
* Content Grabber
* Diffbot
* Dexi.io
* DataScraping.co
* Easy Web Extract
* FMiner
* Scrapy
* Helium Scraper
* Scrape.it
* Scrapinghub
* Screen-Scraper
* Salestools.io
* ScrapeHero
* UniPath
* Web Content Extractor
* WebHarvy
* Web Scraper.io
* Web Sundew
* Winautomation
* Web Robots
**1.** [**Beautiful Soup**](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
**Who is it for**: developers who are proficient at programming and want to build a web spider/crawler.
**Why you should use it**: Beautiful Soup is an open-source Python library designed for scraping HTML and XML files. It works on top of the main Python parsers, which have been widely used. If you have programming skills, it works best when you combine this library with Python.
This table summarizes the advantages and disadvantages of each parser:

| Parser | Typical usage | Advantages | Disadvantages |
| --- | --- | --- | --- |
| Python's html.parser | `BeautifulSoup(markup, "html.parser")` | Batteries included; decent speed; lenient (Python 2.7.3 and 3.2) | Not as fast as lxml, less lenient than html5lib |
| lxml's HTML parser | `BeautifulSoup(markup, "lxml")` | Very fast; lenient | External C dependency |
| lxml's XML parser | `BeautifulSoup(markup, "lxml-xml")` or `BeautifulSoup(markup, "xml")` | Very fast; the only currently supported XML parser | External C dependency |
| html5lib | `BeautifulSoup(markup, "html5lib")` | Extremely lenient; parses pages the same way a browser does; creates valid HTML5 | Very slow; external Python dependency |
**2.** [**Octoparse**](https://octoparse.es/)
**Who is it for:** companies or individuals who need to scrape websites in areas like e-commerce, investment, cryptocurrency, marketing, real estate, etc. This software requires no programming or coding skills.
**Why you should use it**: **Octoparse** is a free-for-life SaaS web data platform. You can use it to [scrape web data](https://octoparse.es/) and turn unstructured or semi-structured data from websites into a structured dataset without coding. It also provides [task templates](https://helpcenter.octoparse.es/hc/es/articles/360039675314-Empieze-usar-Easy-Template-una-soluci%C3%B3n-de-web-scraping-para-principiantes) for the most popular websites in Spanish-speaking countries, such as Amazon.es, Idealista, Indeed.es, Mercadolibre, and many others. Octoparse also offers a web data service. You can customize your crawler task to fit your scraping needs.
**PROS**
* Clean, easy-to-use interface with a simple workflow panel
* Ease of use, no special knowledge required
* Versatile capabilities for research work
* Abundant task templates
* Cloud extraction
* Auto-detection
**CONS**
* It takes some time to set up the tool and get the first tasks going
**3.** [**Import.io**](https://www.import.io/)
**Who is it for:** enterprises looking for a web data integration solution.
**Why you should use it:** Import.io is a SaaS web data platform. It provides web scraping software that lets you extract data from websites and organize it into datasets. The web data can be integrated into analytics tools for sales and marketing insight.
**PROS**
* Team collaboration
* Very effective and accurate at extracting data from large lists of URLs
* Crawls pages and scrapes following patterns you specify through examples
**CONS**
* A desktop application would need to be reintroduced, as it recently went cloud-only
* It took learners a while to understand how to use the tool, and then where to use it
**4.** [**Mozenda**](https://www.mozenda.com/)
**Who it's for:** Companies and businesses with fluctuating or real-time data needs.
**Why you should use it:** Mozenda provides a data extraction tool that makes it easy to capture content from the web. It also provides data visualization services, removing the need to hire a data analyst.
**PROS**
* Dynamic agent creation
* Clean graphical interface for designing agents
* Excellent customer support when needed
**CONS**
* The user interface for managing agents could be improved
* When websites change, agents could be better at updating dynamically
* Windows only
**5.** [**Parsehub**](https://www.parsehub.com/)
**Who it's for:** Data analysts, marketers, and researchers who lack programming skills.
**Why you should use it:** ParseHub is visual web scraping software that you can use to get data from the web. You can extract data by clicking any field on the website. It also offers IP rotation, which helps change your IP address when you run into aggressive websites that use anti-scraping techniques.
**PROS**
* Excellent onboarding that helps you understand the workflow and the concepts inside the tool
* Cross-platform: Windows, Mac, and Linux
* No basic programming knowledge needed to get started
* Very high quality user support
**CONS**
* Templates cannot be imported or exported
* Limited integration: JavaScript/regex only
**6.** [**Crawlmonster**](https://www.crawlmonster.com/)
**Who it's for:** SEO and marketing specialists
**Why you should use it:** CrawlMonster is free web scraping software. It lets you scan websites and analyze your site's content, source code, page status, and much more.
**PROS**
* Ease of use
* Customer support
* Data summarization and publishing
* Scans the website for all kinds of data points
**CONS**
* Feature set is not as complete
**7.** [**Connotate**](https://www.connotate.com/)
**Who it's for:** Enterprises looking for a web data integration solution.
**Why you should use it:** Connotate has been working together with Import.io to provide a solution for automating web data scraping. It offers a web data service that can help you scrape, collect, and manage the data.
**PROS**
* Easy to use, especially for non-programmers
* Data arrives daily and is generally quite clean and easy to process
* Supports job scheduling, which helps retrieve data at scheduled times
**CONS**
* A few glitches with each new release cause some frustration
* Identifying faults and resolving them can take longer than we would like
**8.** [**Common Crawl**](https://commoncrawl.org/)
**Who it's for:** Researchers, students, and professors.
**Why you should use it:** Common Crawl is built on the idea of open source in the digital age. It provides open datasets of crawled websites, containing raw web page data, extracted metadata, and text extractions.
Common Crawl is a [non-profit organization](https://es.wikipedia.org/wiki/Organizaci%C3%B3n_sin_%C3%A1nimo_de_lucro) ...
Post pfq78c, "Los 30 Mejores Software Gratuitos de Web Scraping en 2021", posted by melisaxinyue in u_melisaxinyue, 9/1/2021 9:18:25 AM
https://www.reddit.com/r/u_melisaxinyue/comments/pfq78c/los_30_mejores_software_gratuitos_de_web_scraping/
Posted
8/5/2021 1:45:56 AM
Data visualization presents information and data in visual patterns that help people gain insights effectively. A [data visualization tool](https://www.octoparse.es/) uses visual elements such as charts and tables to let the data speak. There are many data visualization tools on the market.
Which is the best data visualization tool? Here is a list of the 30 best data visualization tools in 2021, including their pros, cons, and examples. Then you can decide which one fits your needs.
We divide them into two categories: tools that require no programming, and tools for developers only. Within each category, the tools are grouped by specialization. Some, like Tableau, offer a wide range of charts and tables; some, like Infogram, are well known for infographics; some are gaining popularity thanks to interactive graphics, like Gephi.
https://preview.redd.it/1sjfxf1p2gf71.png?width=700&format=png&auto=webp&v=enabled&s=a94c3468ac5874a0ef378ad247687142be804b27
## Catalog
* Tools for non-technical professionals
1. Tables and charts
* [Free](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div1)
* [Commercial - For individuals or companies](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div2)
* [Commercial - Companies only](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div3)
1. [Infographics](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div4)
2. [Maps](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div5)
3. [Network graphs](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div6)
4. [Mathematical graphs](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div7)
* Tools for developers
1. Tables and charts
* [Free](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div8)
* [Commercial](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div9)
1. [Maps](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div10)
2. [Network graphs](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div11)
3. [Financial charts](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div12)
[Conclusion](https://www.octoparse.es/blog/30-herramientas-de-visualizacion-de-datos#div13)
**Tools for non-technical professionals**
**1. Tables and charts**
**Free:**
**1)** [**RAWGraphs**](https://rawgraphs.io/)
RAWGraphs is an open-source web tool and data visualization framework. Its goal is to provide the missing link between spreadsheet applications (e.g., Microsoft Excel and Apple Numbers) and vector graphics editors (e.g., Adobe Illustrator and Sketch). You can simply paste your data into RAWGraphs, customize your charts, and export them as vector (SVG) or raster (PNG) images. Moreover, data uploaded to RAW is processed only by the web browser, which keeps your data secure.
**Pros**
* Free and open source
* Intuitive and efficient
* Has help documentation
**Cons**
* Not many adjustable options
**2)** [**ChartBlocks**](https://www.chartblocks.com/)
ChartBlocks is a simple online chart-building tool whose data import wizard guides you step by step through importing data and designing charts. Unlike RAWGraphs, you can easily share your charts on social media. You can also export charts as editable vector graphics, or embed charts in websites with a free personal account. Professional and elite accounts are offered as well.
**Pros**
* Free and reasonably priced paid plans available
* Easy-to-use wizard for importing the data you need
**Cons**
* Unclear how robust its API is
* Does not appear to have any mapping capability
**Commercial - for individuals or companies**
Some data visualization tools provide different paid plans for individuals, small teams, and organizations. These tools have more features and more technical support than the free ones.
**3)** [**Tableau**](https://www.tableau.com/)
Tableau is famous worldwide for letting people turn data into effective visualizations (charts, graphs, and even maps). Tableau is a very powerful, secure, and flexible analytics platform: you can drag your data into Tableau and chart it with your colleagues. You can also view generated reports via desktop, browser, mobile device, or embedded in any application.
**Pros**
* Hundreds of data import options
* Mapping capability
* Free public version available
* Many tutorial videos to guide you through using Tableau
**Cons**
* The non-free versions are expensive ($70/month/user for Tableau Creator)
* The public version does not let you keep your data analyses private
**4)** [**Power BI**](https://powerbi.microsoft.com/)
Power BI is a suite of business analytics tools developed by Microsoft and, as such, is well integrated with Microsoft Office. Users can import any data, such as files, folders, and databases, and view data anywhere via the desktop software, the online web editor, and the mobile apps. Power BI is free for individual users and charges only $9.9 per team user per month. Anyone on the team can analyze data and make decisions at any time.
**Pros**
* Affordable and relatively inexpensive
* Offers a wide range of custom visualizations
* Option to upload and view your data in Excel
* Can import data from a wide range of data sources
* Fast updates
**Cons**
* Does not handle complex relationships between tables well
* Does not provide many options for configuring your visualizations
* Cluttered user interface
**5)** [**QlikView**](https://www.qlik.com/us/products/qlikview)
QlikView is a business intelligence tool aimed mainly at business users in organizations; users can easily analyze their data and use QlikView's business analysis and reporting capabilities to support decision making. QlikView also provides a personal edition so that individual users can enjoy its powerful features. You can simply type the keywords you want to search for within the dataset, and QlikView can help you find unexpected insights and data associations.
**Pros**
* Provides a dynamic business intelligence ecosystem for the user
* Data sharing
* Low maintenance
* Offers many attractive, colorful data visualization options
**Cons**
* Low RAM limit
* Difficult application development
* Requires many additional purchases
**6)** [**FineReport**](http://www.finereport.com/en/?utm_source=Octoparse&utm_medium=media&utm_term=30dvtools&utm_content=30dvtools)
FineReport is reporting and dashboard software with impressive visualization effects. It provides stunning self-developed HTML5 charts that display smoothly on any website or web page, with cool 3D and dynamic effects. Visualizations adapt to any screen size, from TVs and large displays to mobile devices. Easy drag-and-drop operations achieve all the effects.
What's more, a pleasant surprise I discovered is that FineReport is [FREE for individual users.](http://...
Post oy7hqn, "Las 30 Mejores Herramientas de Visualización de Datos en 2021", posted by melisaxinyue in u_melisaxinyue, 8/5/2021 1:45:56 AM
https://www.reddit.com/r/u_melisaxinyue/comments/oy7hqn/las_30_mejores_herramientas_de_visualización_de/
Posted
9/15/2021 9:39:16 AM
You probably know how to use basic functions in Excel. It's easy to sort, apply filters, make charts, and outline data with Excel. You can even perform advanced data analysis using pivot and regression models. It becomes an easy job once live data is turned into a structured format.
The problem is: how do we extract data and put it into Excel? This can be tedious if you do it manually by typing, searching, copying, and pasting repeatedly. Instead, you can automate the extraction of web data into Excel.
In this article, I will show you several ways to save time and energy by scraping web data into Excel.
https://preview.redd.it/fz8zb7kxzmn71.png?width=1600&format=png&auto=webp&v=enabled&s=73fe8db80baef06dd1806e8ac5f7b39936bb01a3
Disclaimer: There are many other ways to scrape data from a website using programming languages such as PHP, Python, Perl, Ruby, etc. Here we only talk about how to get data from a website into Excel for non-coders.
**Table of contents**
Get web data using Excel Web Queries
Get web data using Excel VBA
Use automated web scraping tools
Outsource your web scraping project
## Get web data using Excel Web Queries
Aside from manually copying and pasting data from a web page, Excel Web Queries can be used to quickly pull data from standard web pages into Excel spreadsheets. They can automatically detect tables embedded in the page's HTML. Excel Web Queries can also be used in situations where a standard ODBC (Open Database Connectivity) connection is difficult to create or maintain. You can scrape a table directly from any website using Excel Web Queries.
The process boils down to a few simple steps (see [this article](https://www.excel-university.com/pull-external-data-into-excel)):
1. Go to Data > Get External Data > From Web
2. A browser window named "New Web Query" will appear
3. Type the web address into the address bar.
https://preview.redd.it/8rf2f82g0nn71.png?width=645&format=png&auto=webp&v=enabled&s=3ead140b70952b1a56c46c225fedce328639b60d
(photo from excel-university.com)
4. The page will load and show yellow icons next to data/tables on the page.
5. Select the appropriate one
6. Press the Import button.
You have now scraped the web data into an Excel spreadsheet, neatly arranged into rows and columns as you wish.
https://preview.redd.it/3y1mws5h0nn71.png?width=845&format=png&auto=webp&v=enabled&s=85053ee29ac6de9ef4ac7c0445a196dd48c95b3f
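For the curious, the same "pull a table into rows and columns" idea can be sketched outside Excel. This is a minimal illustration in Python, not what Web Queries actually runs: the `<table>` snippet is invented for the example, and the stdlib `html.parser` stands in for Excel's table detection; the result is written as CSV, which Excel opens directly.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical table fragment standing in for a page's HTML.
SAMPLE_TABLE = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Laptop</td><td>999</td></tr>
  <tr><td>Mouse</td><td>25</td></tr>
</table>
"""

class TableParser(HTMLParser):
    """Collects <tr> rows and their <td>/<th> cells into a list of lists."""
    def __init__(self):
        super().__init__()
        self.rows, self.row, self.cell = [], [], None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row = []
        elif tag in ("td", "th"):
            self.cell = ""

    def handle_endtag(self, tag):
        if tag == "tr":
            self.rows.append(self.row)
        elif tag in ("td", "th"):
            self.row.append(self.cell)
            self.cell = None

    def handle_data(self, data):
        if self.cell is not None:
            self.cell += data.strip()

tp = TableParser()
tp.feed(SAMPLE_TABLE)

# Write the rows out as CSV so Excel can open the result directly.
buf = io.StringIO()
csv.writer(buf).writerows(tp.rows)
print(buf.getvalue())
```

Excel's Web Query does roughly this for you, plus fetching the page, which is why it needs no code at all.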
## Get web data using Excel VBA
Most of us use formulas in Excel a lot (e.g., =AVG(...), =SUM(...), =IF(...)), but we are less familiar with the built-in language: Visual Basic for Applications, a.k.a. VBA. It is commonly known as "Macros," and such Excel files are saved with the .xlsm extension.
Before using it:
**First**, you must enable the Developer tab on the ribbon (right-click File -> Customize the Ribbon -> check the Developer tab).
**Then** set up your layout. In this developer interface, you can write VBA code attached to various events. Click HERE (https://msdn.microsoft.com/en-us/library/office/ee814737(v=office.14).aspx) to get started with VBA in Excel 2010.
https://preview.redd.it/nz2u58qi0nn71.png?width=700&format=png&auto=webp&v=enabled&s=65bb4b9fa13478e027aac81971029fbf57df2f21
Using Excel VBA is going to be a bit technical, and not very friendly for the non-programmers among us. VBA works by running macros, step-by-step procedures written in Excel Visual Basic. To scrape data from websites into Excel using VBA, we need to build or obtain a VBA script that sends requests to web pages and gets the data returned from those pages. It is common to use VBA with XMLHTTP and regular expressions to parse the web pages. On Windows, you can use VBA with WinHTTP or InternetExplorer to scrape website data into Excel.
With a little patience and practice, it is worth learning some Excel VBA and some HTML to make your web scraping in Excel much easier and more efficient at automating repetitive work. There is plenty of material and there are many forums where you can learn to write VBA code.
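The "fetch with XMLHTTP, then parse with regular expressions" pattern described above translates directly to other languages. Here is a rough sketch of the parsing half in Python; the HTML fragment and the `price` markup are hypothetical, and in the VBA version the fetch would typically go through an XMLHTTP request object and the matching through VBA's regex support:

```python
import re

# Made-up fragment standing in for the HTML a request would return.
html = '<span class="price">$19.99</span> <span class="price">$4.50</span>'

# Illustrative pattern for the target fields; each match would
# become one cell in the spreadsheet.
prices = re.findall(r'<span class="price">\$([\d.]+)</span>', html)
print(prices)
```

Regex parsing is fragile against HTML changes, which is one reason the article steers non-coders toward Web Queries or dedicated tools instead.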
## Use automated web scraping tools
For anyone looking for a quick way to scrape data from pages into Excel without setting up VBA code yourself, I strongly recommend automated web scraping tools such as Octoparse, which can scrape data into your Excel spreadsheet directly or via API.
There is no need to learn programming. You can pick one of the free web scraping programs from this [list](https://www.octoparse.es/blog/las-20-mejores-herramientas-de-web-scraping), start extracting data from websites immediately, and export it to Excel. Different web scraping tools have their pros and cons, and you can choose the one that fits your needs.
## Outsource your web scraping project
If time is your most valuable asset and you want to focus on your core business, the best option would be to outsource such a complicated web scraping job to a competent web scraping team with experience and expertise.
Scraping data from websites is hard, because anti-scraping measures restrict the practice of web scraping. A competent web scraping team will help you get data from websites properly and deliver structured data in an Excel sheet, or in any format you need.
Post pomusl, "4 Formas de Extraer Datos del Sitio Web a Excel", posted by melisaxinyue in r/webscraping, 9/15/2021 9:39:16 AM
https://www.reddit.com/r/webscraping/comments/pomusl/4_formas_de_extraer_datos_del_sitio_web_a_excel/
Posted
9/10/2021 8:13:04 AM
How much do you know about web scraping? Don't worry: this article will brief you on **the basics of web scraping**, show you how to assess a **web scraping tool** so you get one that perfectly fits your needs, and, last but not least, present **a list of web scraping tools** for your reference.
**Table of contents**
Web Scraping and How It Is Used
How to Choose a Web Scraping Tool
Three Types of Web Scraping Tools
Client-side Web Scraping Software
* 1. Octoparse
* 2. ParseHub
* 3. Import.io
Web Scraping Plugins/Extensions
* 1. Data Scraper (Chrome)
* 2. Web scraper
* 3. Scraper (Chrome)
* 4. Outwit hub (Firefox)
Web-based Scraping Applications
* 1. Dexi.io
* 2. Webhose.io
## Web Scraping and How It Is Used
Web scraping is a way to collect data from web pages with a scraping bot, so the whole process is done automatically. The technique lets people get web data at scale, fast. Meanwhile, instruments such as **Regex** (regular expressions) enable data cleaning during the scraping process, which means **people can get clean, well-structured data in one place**.
**How does web scraping work?**
* First, a web scraping bot simulates the act of a human browsing the website. With the target URL entered, it sends a request to the server and gets information back in the HTML file.
* Next, with the HTML source code at hand, the bot can reach the node where the target data lies and parse the data as instructed by the scraping code.
* Finally (depending on how the scraping bot is configured), the pool of scraped data is cleaned, put into a structure, and made ready for download or transfer to your database.
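The three steps above can be sketched in a few lines of Python. This is a minimal, generic illustration rather than any particular tool's code: the HTML snippet is a made-up stand-in for the response a bot would get from the target URL, and the stdlib `html.parser` plays the role of the parsing step.

```python
from html.parser import HTMLParser

# Step 1 stand-in: the HTML a bot would receive after requesting the URL.
SAMPLE_HTML = """
<html><body>
  <ul id="products">
    <li class="item">Laptop - $999</li>
    <li class="item">Mouse - $25</li>
  </ul>
</body></html>
"""

class ItemParser(HTMLParser):
    """Step 2: reach the target nodes (<li class="item">) and pull their text."""
    def __init__(self):
        super().__init__()
        self.in_item = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "li" and ("class", "item") in attrs:
            self.in_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_item = False

    def handle_data(self, data):
        if self.in_item and data.strip():
            self.items.append(data.strip())

parser = ItemParser()
parser.feed(SAMPLE_HTML)

# Step 3: the scraped data is now a structured list, ready to export.
print(parser.items)
```

A real bot would fetch the HTML over the network and write the result to a file or database, but the request/parse/structure pipeline is the same.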
## How to Choose a Web Scraping Tool
There are several ways to access web data. Even once you have narrowed it down to a web scraping tool, the tools that pop up in search results, with all their confusing features, can still make the decision hard to reach. There are a few dimensions you can consider before choosing a web scraping tool:
* **Device:** if you are a Mac or Linux user, make sure the tool supports your system.
* **Cloud service:** a cloud service matters if you want to access your data across devices at any time.
* **Integration:** how will you use the data later? Integration options enable better automation of the whole data-handling process.
* **Training:** if you don't excel at programming, make sure there are guides and support to help you throughout the data-scraping journey.
* **Pricing:** yes, the cost of a tool should always be taken into account, and it varies a lot between providers.
Now you may want to know which web scraping tools you can choose from:
## Three Types of Web Scraping Tools
* Client-side web scrapers
* Web scraping plugins/extensions
* Web-based scraping applications
There are many free web scraping tools. However, not all web scraping software is for non-programmers. The lists below are the best no-coding web scraping tools at a low cost. The free software listed below is easy to pick up and would satisfy most scraping needs with a reasonable amount of data.
### Client-side Web Scraping Software
#### 1. Octoparse
Octoparse is a robust web scraping tool that also provides a web scraping service for business owners and enterprises.
* **Device**: since it can be installed on both **Windows** and **Mac OS**, users can extract data with Apple devices.
* **Data:** web data extraction for social media, e-commerce, marketing, real estate listings, etc.
* **Function**:
  - handles static and dynamic websites with AJAX, JavaScript, cookies, etc.
  - extracts data from complex websites that **require login and pagination**
  - deals with information not shown on the page **by parsing the source code**
* **Use cases**: as a result, you can achieve automatic inventory tracking, price monitoring, and lead generation at your fingertips.
Octoparse offers different options for users with different levels of coding skills.
* **Task Template Mode**: a user with basic data scraping skills can use this new feature, which turns web pages into structured data instantly. Task Template Mode takes only about 6.5 seconds to pull the data behind a page and lets you download the data to Excel.
* **Advanced Mode** has more flexibility than the other two modes. It lets users configure and edit the workflow with more options. Advanced Mode is used to scrape more complex websites with a massive amount of data.
* The new **auto-detection** feature lets you build a crawler in one click. If you are not satisfied with the auto-generated data fields, you can always customize the scraping task to scrape the data for you.
* **Cloud services** enable large-scale data extraction within a short time, since multiple cloud servers run one task concurrently. Besides that, the cloud service lets you store and retrieve the data at any time.
#### 2. ParseHub
ParseHub is a web scraper that collects data from websites using AJAX, JavaScript, cookies, and similar technologies. ParseHub leverages machine learning technology that can read, analyze, and transform web documents into relevant data.
* **Device:** the ParseHub desktop application supports **Windows, Mac OS X, and Linux**, or you can use the browser extension for instant scraping.
* **Pricing:** it is not completely free, but you can still set up to five scraping tasks for free. The paid subscription plan lets you set up at least 20 private projects.
* **Tutorial:** there are plenty of tutorials for ParseHub, and you can find more information on the home page.
#### 3. Import.io
Import.io is SaaS web data integration software. It provides a visual environment for end users to design and customize data collection workflows. It covers the entire web extraction lifecycle, from data extraction to analysis, within one platform. And you can easily integrate it into other systems as well.
* **Function:** large-scale data scraping, capturing photos and PDFs in a workable format
* **Integration:** integration with data analysis tools
* **Pricing:** the price of the service is only presented through case-by-case consultation
### Web Scraping Plugins/Extensions
#### 1. Data Scraper (Chrome)
Data Scraper can extract data from tables and list-type data from a single web page. Its free plan should satisfy simple scraping with a small amount of data. The paid plan has more features, such as an API and many anonymous IP proxies, and can retrieve a large volume of real-time data faster. You can scrape up to 500 pages per month; if you need to scrape more pages, you need to upgrade to a paid plan.
#### 2. Web scraper
Web Scraper has a Chrome extension and a cloud extension.
* With the Chrome extension version, you can create a sitemap (plan) for how a website should be navigated and what data should be scraped.
* The cloud extension can scrape a large volume of data and run multiple...
Post plgssy, "9 herramientas de Web Scraping Gratuitas que No Te Puedes Perder en 2021", posted by melisaxinyue in r/webscraping, 9/10/2021 8:13:04 AM
https://www.reddit.com/r/webscraping/comments/plgssy/9_herramientas_de_web_scraping_gratuitas_que_no/
Posted
9/15/2021 9:39:16 AM
Probablemente sepas cómo usar funciones básicas en Excel. Es fácil hacer cosas como ordenar, aplicar filtros, hacer gráficos y delinear datos con Excel. Incluso puedes realizar análisis de datos avanzados utilizando modelos de pivote y regresión. Se convierte en un trabajo fácil cuando los datos en vivo se convierten en un formato estructurado.
El problema es, ¿Cómo podemos extraer datos y ponerlos en Excel? Esto puede ser tedioso si lo haces manualmente escribiendo, buscando, copiando y pegando repetidamente. En cambio, puedes lograr la extracciñon automática de datos de la web para sobresalir.
En este artículo, te presentaré varias formas de ahorrar tiempo y energía, scrapear datos web en Excel.
https://preview.redd.it/fz8zb7kxzmn71.png?width=1600&format=png&auto=webp&v=enabled&s=73fe8db80baef06dd1806e8ac5f7b39936bb01a3
Descargo de responsabilidad: Hay muchas otras formas de scrapear datos desde una web utilizando lenguajes de programación como PHP, Python, Perl, Ruby, etc. Aquí solo hablamos sobre cómo obtener datos de una web en Excel para no codificadores.
&#x200B;
**Tabla de contenidos**
Obtener datos web utilizando Excel Web Queries
Obtener datos de la web usando Excel VBA
Utilizar herramientas de web scraping automatizadas
Subcontratar tu proyecto de web scraping
## Obtener datos web utilizando Excel Web Queries
Excepto para transformar manualmente los datos de una página web copiando y pegando, Excel Web Queries se utiliza para recuperar rápidamente datos de páginas web estándar en hojas de cálculo de Excel. Puede detectar automáticamente tablas incrustadas en el HTML de la página web. Excel Web queries también se pueden usar en situaciones en las que es difícil crear o mantener una conexión estándar ODBC (Open Database Connectivity). Puede scrapear directamente una tabla desde cualquier sitio web utilizando Excel Web Queries.
El proceso se reduce a varios pasos simples (consulta [este artículo](https://www.excel-university.com/pull-external-data-into-excel)):
1. Ir a Datos> Obtener datos externos> Dar la web
2. Aparecerá una ventana del navegador llamada "New Web Query"
3. Escribir la dirección web en la barra de direcciones.
https://preview.redd.it/8rf2f82g0nn71.png?width=645&format=png&auto=webp&v=enabled&s=3ead140b70952b1a56c46c225fedce328639b60d
(foto de excel-university.com)
4. Se cargará y mostrará iconos amarillos contra datos/tablas en la página.
5. Seleccionar uno apropiado
6. Presionar el botón Importar.
Ahora has scrapeado los datos de la web en una hoja de cálculo de Excel, perfecta permutación en filas y columnas como desees.
https://preview.redd.it/3y1mws5h0nn71.png?width=845&format=png&auto=webp&v=enabled&s=85053ee29ac6de9ef4ac7c0445a196dd48c95b3f
## Obtener datos de la web usando Excel VBA
La mayoría de nosotros usaría fórmulas en Excel (p. Ej. = Avg (...), = sum (...), = if (...), etc.) mucho, pero menos familiarizado con el lenguaje incorporado: Visual BasicVisual Basic for Application a.k.a VBA. Se conoce comúnmente como "Macros" y dichos archivos de Excel se guardan como a \*\*.xlsm.
Antes de usarlo,
**Primero** debes habilitar la pestaña la pestaña Desarrollador en la barra (hacer clic con el botón derecho en Archivo -> Personalizar barra -> verificar la pestaña Desarrollador),
**Luego** configura tu diseño. En esta interfaz de desarrollador, puedes escribir código VBA adjunto a varios eventos. Haz clic AQUÍ (https://msdn.microsoft.com/en-us/library/office/ee814737(v=office.14).aspx) para comenzar a utilizar VBA en Excel 2010.
https://preview.redd.it/nz2u58qi0nn71.png?width=700&format=png&auto=webp&v=enabled&s=65bb4b9fa13478e027aac81971029fbf57df2f21
Usar Excel VBA va a ser un poco técnico, esto no es muy amigable para quienes no son programadores entre nosotros. VBA funciona ejecutando macros, procedimientos paso a paso escritos en Excel Visual Basic. Para scrapear datos de sitios web a Excel usando VBA, necesitamos construir u obtener un script VBA para enviar alguna solicitud a las páginas web y obtener datos devueltos de estas páginas web. Es común usar VBA con XMLHTTP y expresiones regulares para analizar las páginas web. Para Windows, puedes usar VBA con WinHTTP o InternetExplorer para scrapear datos de sitios web a Excel.
Con un poco de paciencia y práctica, te convendría aprender algo de código Excel VBA y algo de conocimiento HTML para que tu Web scraping en Excel sea mucho más fácil y eficiente para automatizar el trabajo repetitivo. Hay una gran cantidad de material y foros para que aprendas a escribir código VBA.
## Use automated web scraping tools
For anyone looking for a quick tool to scrape data from pages into Excel who doesn't want to set up the VBA code themselves, I strongly recommend automated web scraping tools such as Octoparse, which can scrape data into your Excel spreadsheet directly or via an API.
There is no need to learn to program. You can pick one of the free web scraping programs from the [list](https://www.octoparse.es/blog/las-20-mejores-herramientas-de-web-scraping), start extracting data from websites right away, and export it to Excel. Different web scraping tools have their pros and cons, and you can choose the one that fits your needs best.
## Outsource your web scraping project
If time is your most valuable asset and you want to focus on your core business, the best option would be to outsource such a complicated web scraping job to a competent web scraping team with the experience and expertise.
Scraping data from websites is hard because anti-scraping bots restrict the practice of web scraping. A competent web scraping team can help you get data from websites properly and deliver structured data to you in an Excel sheet, or in any format you need.
pomusl
webscraping
melisaxinyue
t3_pomusl
https://www.reddit.com/r/webscraping/comments/pomusl/4_formas_de_extraer_datos_del_sitio_web_a_excel/
9/15/2021 9:39:16 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
4 Formas de Extraer Datos del Sitio Web a Excel
False
0.29
pomusl
0
7400
5
5
128, 128, 128
3
Solid
50
No
437
Commented
7/22/2020 10:09:20 AM
Very good!
fyux0ht
webscraping
matty_fu
t1_fyux0ht
https://www.reddit.com/r/webscraping/comments/hvr1dp/las_20_mejores_herramientas_de_web_scraping_para/fyux0ht/
7/22/2020 10:09:20 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hvr1dp
t3_hvr1dp
hvr1dp
0
hvr1dp
False
False
False
0
1
5
5
0
0
0
0
0
0
1
50
2
128, 128, 128
3
Solid
50
No
301
Commented
5/21/2020 4:45:44 PM
Yes yes, very good, a beer please, taco
frcw8mn
webscraping
autistic_alpha
t1_frcw8mn
https://www.reddit.com/r/webscraping/comments/gntedx/3_web_scraping_aplicaciones_para_ganar_dinero/frcw8mn/
5/21/2020 4:45:44 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gntedx
t3_gntedx
gntedx
0
gntedx
False
False
False
0
1
5
5
1
11.1111111111111
0
0
0
0
3
33.3333333333333
9
128, 128, 128
3
Solid
50
No
299
Commented
11/7/2022 8:56:41 AM
You could use a [residential proxy](https://soax.com/?utm_source=social&utm_medium=reddit&utm_content=answer) to get IPs that you can rotate. There is a tutorial on how it can be done: https://helpcenter.soax.com/en/articles/6342792-how-to-connect-soax-proxies-via-octoparse
ive6yzj
webscraping
AmandaKamen
t1_ive6yzj
https://www.reddit.com/r/webscraping/comments/m2hxcn/ip_rotation_with_octoparse/ive6yzj/
11/7/2022 8:56:41 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
m2hxcn
t3_m2hxcn
m2hxcn
0
m2hxcn
False
False
False
0
1
19
19
0
0
0
0
0
0
14
46.6666666666667
30
128, 128, 128
3
Solid
50
No
298
Commented
11/2/2022 2:40:22 PM
If you insist on using Octoparse as your scraper, there is a simple proxy IP rotation guide for it here:
[https://www.octoparse.com/tutorial-7/set-up-proxies#](https://www.octoparse.com/tutorial-7/set-up-proxies#)
First, you will need to subscribe to a [rotating proxy](https://brightdata.com/solutions/rotating-proxies) provider, get the proxy server's address and port number, and enter those in your Octoparse proxy settings.
iurk06u
webscraping
HumorMinimum1707
t1_iurk06u
https://www.reddit.com/r/webscraping/comments/m2hxcn/ip_rotation_with_octoparse/iurk06u/
11/2/2022 2:40:22 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
m2hxcn
t3_m2hxcn
m2hxcn
0
m2hxcn
False
False
False
0
1
19
19
0
0
0
0
0
0
37
51.3888888888889
72
128, 128, 128
3
Solid
50
No
297
Commented
3/26/2021 3:22:10 PM
You should avoid using VPNs as well as datacenter proxies for web scraping. As these IPs are used by many people at a time, the chances of getting blocked are high.
For any kind of scraping, dedicated 4G mobile proxies are pretty useful. These come with both manual and auto-rotation options, so you can decide how long you'd like to send requests from a particular IP and then rotate it at definite intervals. That way you won't be sending too many requests from one IP, and you potentially reduce the chance of those passive-aggressive threats.
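The manual-rotation idea above can be sketched in a few lines of Python: a helper that cycles through a pool of proxy addresses, one per request batch. The addresses below are placeholders; in practice your proxy provider supplies the real list.

```python
import itertools

def make_proxy_rotator(proxies):
    """Return a function that yields the next proxy setting on each call.

    `proxies` is a list like ["10.0.0.1:8000", "10.0.0.2:8000"] (placeholder
    addresses). The returned dict matches the shape expected by
    urllib/requests-style proxy configuration.
    """
    pool = itertools.cycle(proxies)  # endless round-robin over the pool
    def next_proxy():
        addr = next(pool)
        return {"http": f"http://{addr}", "https": f"http://{addr}"}
    return next_proxy
```

Calling the rotator before each batch of requests spreads the traffic evenly across the pool instead of hammering one IP.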
gsattt4
webscraping
ristoriel
t1_gsattt4
https://www.reddit.com/r/webscraping/comments/m2hxcn/ip_rotation_with_octoparse/gsattt4/
3/26/2021 3:22:10 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
m2hxcn
t3_m2hxcn
m2hxcn
0
m2hxcn
False
False
False
0
1
19
19
5
5.05050505050505
2
2.02020202020202
0
0
45
45.4545454545455
99
128, 128, 128
3
Solid
50
No
300
Posted
3/11/2021 4:38:50 AM
Hello comrades,
I am doing a data project and want to scrape a sizeable amount of data from a certain website. I'm still getting the hang of best practices as far as scraping is concerned; unfortunately I learned a hard lesson by not adding a delay in between inputs. I got a nice little 429 error along with a passive aggressive threat.
I don't want to repeat that error, so I've added a few seconds' delay between some actions. However, I'm still worried my IP will get flagged and maybe blacklisted, which wouldn't be ideal. Octoparse provides an option to rotate IPs at a set interval, provided you enter the address and port. I'm not entirely sure if you need a VPN service to access the IPs, so I went ahead and downloaded ProtonVPN. Unfortunately, I haven't quite pinned down what to enter for the port (if it even matters).
If anyone has any experience with IP rotation in Octoparse or in general, some guidance would be greatly appreciated. If I failed to mention anything pertinent, ask away and I will fill in the gaps.
Thanks
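On the delay question: a common pattern after a 429 is exponential backoff with random jitter rather than a fixed pause, so retries don't fire in a detectable rhythm. A small sketch (the base and factor values are arbitrary starting points, not anything a particular site requires):

```python
import random

def backoff_delays(base=2.0, factor=2.0, retries=4, jitter=1.0):
    """Seconds to wait before each retry after a rate-limit response.

    Delay grows exponentially (base * factor**i) with up to `jitter`
    seconds of random noise added, so consecutive attempts are neither
    too fast nor perfectly periodic.
    """
    return [base * factor ** i + random.uniform(0, jitter) for i in range(retries)]
```

You would `time.sleep()` each value in turn between attempts, giving up after the list is exhausted.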
m2hxcn
webscraping
LiberalExpenditures
t3_m2hxcn
https://www.reddit.com/r/webscraping/comments/m2hxcn/ip_rotation_with_octoparse/
3/11/2021 4:38:50 AM
1/1/0001 12:00:00 AM
False
False
4
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
IP Rotation with Octoparse
False
0.75
m2hxcn
0
1
19
19
5
2.63157894736842
14
7.36842105263158
0
0
80
42.1052631578947
190
128, 128, 128
3
Solid
50
No
296
Commented
3/19/2021 2:00:55 PM
I have been struggling to find a solution for this as well. Yesterday someone told me about aws lambda functions. I put some code in there to log the IP address and looks like it changes every 5-20 minutes. So that might be a quick way to get started. I think the other thing to do would be to use EC2 and spin up new instances. But I have no idea how to do that, that’s why I went with Lambda functions. Piece of cake to use. If all else fails I’m thinking of hiring someone from up work
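A minimal sketch of the Lambda idea above: a handler that logs the function's current egress IP so you can see how often it changes between invocations. Using api.ipify.org as the echo service is my assumption (any service that returns your public IP works), and the `fetch` parameter exists only so the JSON handling can be exercised without a network call:

```python
import json
import urllib.request

def get_egress_ip(fetch=None):
    """Return the public IP this environment makes requests from."""
    if fetch is None:
        def fetch():
            # api.ipify.org echoes the caller's public IP as JSON.
            with urllib.request.urlopen("https://api.ipify.org?format=json") as r:
                return r.read().decode()
    return json.loads(fetch())["ip"]

def lambda_handler(event, context):
    # Log the IP so CloudWatch shows how often AWS rotates it.
    ip = get_egress_ip()
    print(f"egress ip: {ip}")
    return {"statusCode": 200, "body": ip}
```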
grhb772
webscraping
adrianhorning
t1_grhb772
https://www.reddit.com/r/webscraping/comments/m2hxcn/ip_rotation_with_octoparse/grhb772/
3/19/2021 2:00:55 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
m2hxcn
t3_m2hxcn
m2hxcn
0
m2hxcn
False
False
False
0
1
19
19
2
1.96078431372549
2
1.96078431372549
0
0
36
35.2941176470588
102
128, 128, 128
3
Solid
50
No
295
Commented
2/9/2021 2:20:16 AM
WritersAccess is a very good resource for good writers. I work in science/tech so the average writer can’t deliver what I need. There is also a concierge that helps you find the best suited writer.
gmnizy0
content_marketing
Samantha-diane
t1_gmnizy0
https://www.reddit.com/r/content_marketing/comments/lechaj/where_can_i_find_experienced_techseo_writers/gmnizy0/
2/9/2021 2:20:16 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lechaj
t3_lechaj
lechaj
0
lechaj
False
False
False
0
1
3
3
4
10.8108108108108
0
0
0
0
15
40.5405405405405
37
128, 128, 128
3
Solid
50
No
292
Commented
2/8/2021 3:51:34 PM
Sorry I'm a bit late... I think I (well really me and my friend's company) is exactly what you're looking for. Hopefully I'm not too late, but just let it be known that I'll send over my proposition shortly (hopefully by the end of the day!)
gmlawab
content_marketing
SuccessMysterious887
t1_gmlawab
https://www.reddit.com/r/content_marketing/comments/lechaj/where_can_i_find_experienced_techseo_writers/gmlawab/
2/8/2021 3:51:34 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lechaj
t3_lechaj
lechaj
0
lechaj
False
False
False
0
1
3
3
1
2.17391304347826
1
2.17391304347826
0
0
18
39.1304347826087
46
128, 128, 128
3
Solid
50
No
291
Commented
2/7/2021 10:16:43 AM
Will be sharing samples in DM.
gme8wkv
content_marketing
Dravodin
t1_gme8wkv
https://www.reddit.com/r/content_marketing/comments/lechaj/where_can_i_find_experienced_techseo_writers/gme8wkv/
2/7/2021 10:16:43 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lechaj
t3_lechaj
lechaj
0
lechaj
False
False
False
0
1
3
3
0
0
0
0
0
0
3
50
6
128, 128, 128
3
Solid
50
No
290
Commented
2/7/2021 8:20:59 AM
Put a post on LinkedIn and you'll be spammed
by Indian writers for at least a week
gmdd079
content_marketing
KingCapital-
t1_gmdd079
https://www.reddit.com/r/content_marketing/comments/lechaj/where_can_i_find_experienced_techseo_writers/gmdd079/
2/7/2021 8:20:59 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lechaj
t3_lechaj
lechaj
0
lechaj
False
False
False
0
1
3
3
0
0
0
0
0
0
6
35.2941176470588
17
128, 128, 128
3
Solid
50
No
289
Commented
2/7/2021 7:23:15 AM
I vibe with your post a lot.
It's hard to find quality writers. And if you do, the rates are high. You surely get what you pay for.
There was a time I was overwhelmed with work and I was looking for writers to outsource to, and I was completely shocked with the kind and samples that I was getting.
I had to tighten my belt and complete the project myself. If you get good writers out there( who are like diamonds to find) don't let them go.
It's true, freelancer these days is not a place to look for writers, PPH also.
But most people these days pay$0.03 per word citing long term collaborations, even for the most technical jobs, without considering the time and labour the writer has spent working on the job.
Sorry for the long post. In short, it's important for the employer to know the value of a good writer and pay them according to the quality of the work.
As I always see, you get what you pay for. :)
gmcyn24
content_marketing
writerDiana
t1_gmcyn24
https://www.reddit.com/r/content_marketing/comments/lechaj/where_can_i_find_experienced_techseo_writers/gmcyn24/
2/7/2021 7:23:15 AM
1/1/0001 12:00:00 AM
False
False
6
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lechaj
t3_lechaj
lechaj
0
lechaj
False
False
False
0
1
3
3
5
2.82485875706215
4
2.25988700564972
0
0
64
36.1581920903955
177
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
294
Commented
2/8/2021 8:24:44 PM
Another thing to think about is your payment setup. Instead of per word (where it seems you get a lot of fluff), look at paying per piece or on a contract for a specific amount of articles. Set amounts for over or under 500 words, 1200 words, etc. based on your need.
Back when I was freelance writing I had several people suggest I sign up for some of the online sites. The pay offers were not worth my time. The company I’m with now paid $35 for a press release I can guarantee you no journalists or bloggers would have bothered with. Rewriting it and getting it published got me my current content marketing job.
gmmbevq
content_marketing
DivaJanelle
t1_gmmbevq
https://www.reddit.com/r/content_marketing/comments/lechaj/where_can_i_find_experienced_techseo_writers/gmmbevq/
2/8/2021 8:24:44 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lechaj
t3_lechaj
lechaj
0
lechaj
False
False
False
0
4
3
3
2
1.70940170940171
1
0.854700854700855
0
0
54
46.1538461538462
117
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
293
Commented
2/7/2021 5:12:38 AM
Underemployed journalists laid off in the past 10 years. They might not have the base knowledge you are hoping for but with some direction... you can get the content you are looking for
gmcfxab
content_marketing
DivaJanelle
t1_gmcfxab
https://www.reddit.com/r/content_marketing/comments/lechaj/where_can_i_find_experienced_techseo_writers/gmcfxab/
2/7/2021 5:12:38 AM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lechaj
t3_lechaj
lechaj
1
lechaj
False
False
False
0
4
3
3
0
0
0
0
0
0
12
36.3636363636364
33
128, 128, 128
3
Solid
50
Yes
288
RepliedTo
2/7/2021 5:42:30 AM
I am just such a journalist, and I even know the knowledge.
The problem is that whenever there's a post like this they pay the absolute minimum and expect the absolute most advanced writing. They want knowledge of just the things noted above but insist on "we can start with $.03 a word to start but maybe go up from there!"
Nobody who's both very good at writing content and has the knowledge requested is going to write for such low rates.
It's the same problem I tell my consulting clients when they're looking for their "ninja" or "rockstar" coders when they complain that they can't find the right people — the people are out there, you just don't think that you should pay them what they're worth.
Don't get me wrong, I'm all for finding a good deal. I'm a bargain hunter as well. However when it comes to creative things — and writing is a creative thing, even when it's SEO-focused — you get what you pay for.
You are simply not going to get good writers at the sites like Fiverr and PeoplePerHour because people with the real skills aren't there, and that's because those sites do not pay enough money for those with the real chops.
gmcll06
content_marketing
SpaceForceAwakens
t1_gmcll06
https://www.reddit.com/r/content_marketing/comments/lechaj/where_can_i_find_experienced_techseo_writers/gmcll06/
2/7/2021 5:42:30 AM
1/1/0001 12:00:00 AM
False
False
15
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gmcfxab
t1_gmcfxab
gmcfxab
3
lechaj
False
False
False
1
1
3
3
12
5.76923076923077
4
1.92307692307692
0
0
70
33.6538461538462
208
128, 128, 128
3
Solid
50
Yes
287
RepliedTo
2/7/2021 5:46:16 AM
Amen and abso-f’ing-lutely.
gmcm933
content_marketing
DivaJanelle
t1_gmcm933
https://www.reddit.com/r/content_marketing/comments/lechaj/where_can_i_find_experienced_techseo_writers/gmcm933/
2/7/2021 5:46:16 AM
1/1/0001 12:00:00 AM
False
False
6
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gmcll06
t1_gmcll06
gmcll06
0
lechaj
False
False
False
2
1
3
3
0
0
0
0
0
0
4
66.6666666666667
6
128, 128, 128
3
Solid
50
No
286
RepliedTo
2/7/2021 11:53:17 AM
Exactly this, I have no issues finding writers, I use experienced journalists with expertise in the fields I'm commissioning for, but I pay for them. I've never used Fiverr or the like but I just don't think I would trust them, in the past I've had content written by agencies and that's been bad enough and ends up taking me so long to edit I could have just done it myself.
gmetwju
content_marketing
Pupniko
t1_gmetwju
https://www.reddit.com/r/content_marketing/comments/lechaj/where_can_i_find_experienced_techseo_writers/gmetwju/
2/7/2021 11:53:17 AM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gmcll06
t1_gmcll06
gmcll06
0
lechaj
False
False
False
2
1
3
3
2
2.8169014084507
2
2.8169014084507
0
0
24
33.8028169014084
71
128, 128, 128
3
Solid
50
No
285
Posted
5/30/2022 11:48:37 AM
**ScraperAPI**
ScraperAPI is a solution for web scraper developers that handles proxies, browsers, and CAPTCHAs so that developers may extract raw HTML from any website with a single API request. ScraperAPI does not need you to manage your own proxies. Instead, it runs its own internal pool of hundreds of thousands of proxies from a dozen distinct proxy providers, as well as intelligent routing logic that directs requests via multiple subnets. It also automatically throttles queries to circumvent IP blocks and CAPTCHAs, increasing dependability. It's the ultimate online scraping solution for developers, with dedicated proxy pools for ecommerce pricing scraping, search engine scraping, social media scraping, sneaker crawling, ticket scraping, and much more! Start with the greatest online scraping API if you want to construct the best web scraper. If you need to scrape data from millions of pages each month, you may request a bulk discount using this form.
**Scrape-it.Cloud**
Scrape-it.Cloud's API handles all [web scraping API](https://scrape-it.cloud/blog/what-is-web-scraping-api) difficulties. Extracting HTML from a website has never been simpler! It lets you specify your proxy location in order to show geo-targeted content. The API employs a huge IP pool capable of handling even the most enormous web scraping jobs. It also helps you evade rate limits on websites and conceal your scraping bot.
**ScrapeSimple**
ScrapeSimple is the ideal solution for anybody looking to have a unique web scraper tool made for them. It's as easy as filling out a form with the information you want. With a fully managed service that generates and maintains bespoke online scrapers for clients, ScrapeSimple lives up to its name and ranks towards the top of our list of simple web scraping tools.
**Octoparse**
Octoparse is an excellent scraper tool for users who wish to extract data from websites without needing to write code, yet still have complete control over the whole process thanks to its simple user interface. Octoparse is one of the best screen scraping solutions for folks who don't want to learn how to code. It includes a point-and-click screen scraper that enables users to scrape behind login forms, fill out forms, enter search keywords, navigate across infinite scroll, render JavaScript, and so on.
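Most of the proxy-API services above share the same request shape: you pass your API key and the target page as query parameters of a single GET, and the service returns the rendered HTML. A small sketch that builds such a request URL (the base URL follows ScraperAPI's documented pattern; check your own provider's docs for the exact parameter names):

```python
from urllib.parse import urlencode

def build_api_request(api_key, target_url, base="http://api.scraperapi.com"):
    """Build the GET URL for a proxy-API style scraping service.

    The service fetches `target_url` on your behalf, handling proxies
    and CAPTCHAs; you only ever call the one API endpoint.
    """
    return f"{base}/?{urlencode({'api_key': api_key, 'url': target_url})}"
```

The target URL is percent-encoded by `urlencode`, so query strings in the scraped page's own URL survive the round trip.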
v0yfy1
WebDeveloper
Embarrassed_Law_253
t3_v0yfy1
https://www.reddit.com/r/WebDeveloper/comments/v0yfy1/best_web_scraping_tools_for_developers/
5/30/2022 11:48:37 AM
1/1/0001 12:00:00 AM
False
False
2
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Best Web Scraping Tools for Developers
False
0.75
v0yfy1
0
1
2
2
12
3.16622691292876
5
1.31926121372032
0
0
214
56.4643799472296
379
128, 128, 128
3
Solid
50
No
284
Commented
6/16/2022 11:54:29 AM
Octoparse and Scraper API are decent, I haven't used Scrapesimple and Scrape-it (my use case requires better specs). I would also recommend [Data Collector](https://brightdata.grsm.io/vitariz-dca) and Apify.
icklhi6
WebDeveloper
Gidoneli
t1_icklhi6
https://www.reddit.com/r/WebDeveloper/comments/v0yfy1/best_web_scraping_tools_for_developers/icklhi6/
6/16/2022 11:54:29 AM
12/25/2022 9:15:53 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
v0yfy1
t3_v0yfy1
v0yfy1
0
v0yfy1
False
False
False
0
1
2
2
3
9.09090909090909
0
0
0
0
19
57.5757575757576
33
128, 128, 128
3
Solid
50
No
283
Commented
5/30/2022 1:33:48 PM
In this article, we will look at the main web scraping tools available. These tools are not arranged in any particular order, but every one stated here is an extremely useful tool in the hands of its user.
While some require coding skills, some are command-line-based tools, and others are graphical, point-and-click web scraping tools.
Let's get into the thick of things:
1. import.io
2. Dexi.io
3. Octoparse
4. legs
5. Data scraping Studio
6. Crawl Monster
7. Mozenda
8. Selenium
9. Scrapy
iajjx6a
WebDeveloper
niraj06
t1_iajjx6a
https://www.reddit.com/r/WebDeveloper/comments/v0yfy1/best_web_scraping_tools_for_developers/iajjx6a/
5/30/2022 1:33:48 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
v0yfy1
t3_v0yfy1
v0yfy1
0
v0yfy1
False
False
False
0
1
2
2
3
3.06122448979592
1
1.02040816326531
0
0
47
47.9591836734694
98
128, 128, 128
3
Solid
50
No
282
Commented
11/23/2021 11:02:25 PM
There is a huge amount of variation in salaries; not all positions for the same language are "worth" the same. Likewise, between different types of company you won't get the same salaries (and as a junior, your education can have a significant impact).
To work in the field in Toulouse (on the data science side, but in varied teams), I advise you above all to think about the type of job that interests you rather than just chasing the language that pays the most. JavaScript may well be in high demand, but if you have neither interest in nor appetite for front-end work, it's really a bad plan. The same goes for plenty of niche languages.
Likewise, a programming language is just one technical building block among others for a given position; it's not always the most important point (I've seen people manage to train themselves on the job from zero in a language because they already had all the associated technical background).
hltvyum
france
shinversus
t1_hltvyum
https://www.reddit.com/r/france/comments/qzkye6/les_langages_de_programmation_les_plus_demandés/hltvyum/
11/23/2021 11:02:25 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
qzkye6
t3_qzkye6
qzkye6
0
qzkye6
False
False
False
0
1
13
13
2
1.21951219512195
2
1.21951219512195
0
0
93
56.7073170731707
164
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
281
Commented
11/23/2021 7:10:27 AM
If you like data problems, Python/R is a great combo with quite a few opportunities.
hlqqn01
france
viagrabrain
t1_hlqqn01
https://www.reddit.com/r/france/comments/qzkye6/les_langages_de_programmation_les_plus_demandés/hlqqn01/
11/23/2021 7:10:27 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
qzkye6
t3_qzkye6
qzkye6
0
qzkye6
False
False
False
0
4
13
13
1
5
0
0
0
0
8
40
20
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
280
Commented
11/23/2021 7:09:17 AM
Useful note: there are many more fully remote openings since the pandemic, so don't hesitate to look outside your region. In the Paris area, quite a few companies recruit across the whole country (and even beyond).
hlqqjqt
france
viagrabrain
t1_hlqqjqt
https://www.reddit.com/r/france/comments/qzkye6/les_langages_de_programmation_les_plus_demandés/hlqqjqt/
11/23/2021 7:09:17 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
qzkye6
t3_qzkye6
qzkye6
0
qzkye6
False
False
False
0
4
13
13
0
0
1
2.38095238095238
0
0
22
52.3809523809524
42
128, 128, 128
3
Solid
50
No
279
Commented
11/22/2021 6:51:19 PM
Serious doubts about the article, though.
Hardly anyone does only SQL.
Golang ahead of Java or C# seems strange to me.
hlo5qk8
france
Live-Cover4440
t1_hlo5qk8
https://www.reddit.com/r/france/comments/qzkye6/les_langages_de_programmation_les_plus_demandés/hlo5qk8/
11/22/2021 6:51:19 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
qzkye6
t3_qzkye6
qzkye6
0
qzkye6
False
False
False
0
1
13
13
1
4.16666666666667
0
0
0
0
14
58.3333333333333
24
128, 128, 128
3
Solid
50
No
278
Commented
11/22/2021 3:54:42 PM
This article is bogus; here is a real, very thorough study with valid data: [Emploi développeur 2020 : les langages les plus demandés et les mieux payés](https://emploi.developpez.com/actu/315699/Emploi-developpeur-2020-les-langages-les-plus-demandes-et-les-mieux-payes-Java-et-JavaScript-caracolent-en-tete-Kotlin-est-l-espoir-de-l-annee/)
hlnf701
france
Guidule
t1_hlnf701
https://www.reddit.com/r/france/comments/qzkye6/les_langages_de_programmation_les_plus_demandés/hlnf701/
11/22/2021 3:54:42 PM
1/1/0001 12:00:00 AM
False
False
6
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
qzkye6
t3_qzkye6
qzkye6
0
qzkye6
False
False
False
0
1
13
13
0
0
0
0
0
0
37
63.7931034482759
58
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
345
Posted
11/22/2021 12:55:34 PM
I came across this blog about jobs for programmers in France, which discusses some of the reasons why certain programming languages are popular. Here is the article: [*Les langages de programmation les plus demandés et les mieux payés en France*](https://www.octoparse.fr/blog/les-langages-de-programmation-les-plus-demandes-en-france).
I was wondering if anyone could share exact salaries? I'm studying in Paris but hope to work in the south; will there be a big difference in salary?
qzksyo
programmation
nanami2977
t3_qzksyo
https://www.reddit.com/r/programmation/comments/qzksyo/les_langages_de_programmation_les_plus_demandés/
11/22/2021 12:55:34 PM
1/1/0001 12:00:00 AM
False
False
8
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Les langages de programmation les plus demandés et les mieux payés en France?
False
0.9
qzksyo
0
4
13
13
0
0
0
0
0
0
52
58.4269662921348
89
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
344
Posted
11/22/2021 1:03:15 PM
I came across this blog about jobs for programmers in France, which presents some of the reasons why certain programming languages are popular. Here is the article: [*Les langages de programmation les plus demandés et les mieux payés en France*](https://www.octoparse.fr/blog/les-langages-de-programmation-les-plus-demandes-en-france).
I was wondering if anyone could share exact salaries? I'm studying in Paris but hope to work in the south (Toulouse, for example); will there be a big difference in salary?
qzkye6
france
nanami2977
t3_qzkye6
https://www.reddit.com/r/france/comments/qzkye6/les_langages_de_programmation_les_plus_demandés/
11/22/2021 1:03:15 PM
1/1/0001 12:00:00 AM
False
False
4
2
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Les langages de programmation les plus demandés et les mieux payés en France? les salaires moyens?
False
0.61
qzkye6
0
4
13
13
0
0
0
0
0
0
55
59.7826086956522
92
128, 128, 128
3
Solid
50
No
277
Commented
11/22/2021 3:19:47 PM
If you know a niche language like Cobol or Rust and master it thoroughly, you can negotiate a big salary, because those skills are highly sought after in hardware.
hlna9iy
france
Snykeurs
t1_hlna9iy
https://www.reddit.com/r/france/comments/qzkye6/les_langages_de_programmation_les_plus_demandés/hlna9iy/
11/22/2021 3:19:47 PM
1/1/0001 12:00:00 AM
False
False
10
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
qzkye6
t3_qzkye6
qzkye6
1
qzkye6
False
False
False
0
1
13
13
1
3.2258064516129
1
3.2258064516129
0
0
17
54.8387096774194
31
128, 128, 128
3
Solid
50
No
276
RepliedTo
11/22/2021 6:35:07 PM
And me on TCL/TK for hardware CAD tools ^^
hlo3afj
france
bastimars
t1_hlo3afj
https://www.reddit.com/r/france/comments/qzkye6/les_langages_de_programmation_les_plus_demandés/hlo3afj/
11/22/2021 6:35:07 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hlna9iy
t1_hlna9iy
hlna9iy
0
qzkye6
False
False
False
1
1
13
13
0
0
0
0
0
0
5
45.4545454545455
11
128, 128, 128
3
Solid
50
No
275
Commented
6/21/2021 4:36:59 AM
You could do the same with a single wget on the commandline.
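For readers without wget, the same one-liner idea (`wget -i url_list.txt`) can be sketched in Python; `url_list.txt` is an assumed file name holding one image URL per line:

```python
import os
import urllib.request

def filename_from_url(url):
    """Use the last path segment of the URL as the local file name."""
    return url.rstrip("/").rsplit("/", 1)[-1]

def download_all(url_file, out_dir="images"):
    """Download every URL listed (one per line) in `url_file`,
    mirroring `wget -i url_file` on the command line."""
    os.makedirs(out_dir, exist_ok=True)
    with open(url_file) as f:
        urls = [line.strip() for line in f if line.strip()]
    for url in urls:
        urllib.request.urlretrieve(url, os.path.join(out_dir, filename_from_url(url)))
    return urls
```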
h2i7754
u_Octoparseideas
Kaligule
t1_h2i7754
https://www.reddit.com/r/u_Octoparseideas/comments/o4mz0r/how_to_download_images_from_url_list/h2i7754/
6/21/2021 4:36:59 AM
1/1/0001 12:00:00 AM
False
False
7
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o4mz0r
t3_o4mz0r
o4mz0r
0
o4mz0r
False
False
False
0
1
3
3
0
0
0
0
0
0
4
33.3333333333333
12
128, 128, 128
3
Solid
50
No
274
Posted
11/1/2022 9:22:08 PM
Hello! I'm fairly new to web scraping, and have recently encountered an issue with Octoparse's infinite scroll data gathering feature.
Here's the site I'm trying to scrape: [https://search.dca.ca.gov/](https://search.dca.ca.gov/)
I set the workflow to select the Dental Board of California option and click Search,
then check the conscious sedation permit box in License Type on the left,
then auto-detect webpage data (it does this as desired),
Here's where I imagine I'm missing something. At this step of the process, the Octoparse Help Center recommends either instructing the program to automatically deal with infinite scroll or manually instruct it to scroll down and scrape as it goes for X number of pages. ([https://helpcenter.octoparse.com/octoparse/en/articles/6470993-dealing-with-pagination-infinite-scroll](https://helpcenter.octoparse.com/octoparse/en/articles/6470993-dealing-with-pagination-infinite-scroll))
I've tried both approaches, and the scrape inevitably gets stuck at 150 lines of data each time. I've tried adding wait time, adding additional scrolling, retrying after a restart... nothing seems to work! I have verified that there are >150 lines of data to harvest. Any of you savvy scrapers out there know a solution for this?
I'm also open to suggestions as far as alternative software/skills go.
TL;DR - Octoparse Infinite Scroll stuck @ 150 data - how do I fix?
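One way to debug a scrape that stalls at a fixed row count is to drive the scrolling yourself and stop only once the row count genuinely stops growing. This is a generic sketch, not Octoparse's internals: `scroll_once` would typically wrap a Selenium `execute_script("window.scrollTo(0, document.body.scrollHeight)")`, and `count_rows` a DOM query for the loaded result rows.

```python
def scroll_until_stable(scroll_once, count_rows, max_rounds=50, patience=3):
    """Scroll repeatedly until `count_rows()` stops increasing.

    `patience` extra rounds tolerate slow lazy-loading, the usual reason
    a scrape stalls at a fixed number like 150 even though more rows exist.
    Returns the final row count.
    """
    last, stale = 0, 0
    for _ in range(max_rounds):
        scroll_once()
        n = count_rows()
        if n > last:
            last, stale = n, 0      # progress: reset the stall counter
        else:
            stale += 1
            if stale >= patience:   # no growth for `patience` rounds: done
                break
    return last
```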
yjlscq
webscraping
Representative_Art71
t3_yjlscq
https://www.reddit.com/r/webscraping/comments/yjlscq/octoparse_infinite_scroll_problem/
11/1/2022 9:22:08 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse Infinite Scroll Problem
False
1
yjlscq
0
1
64
64
3
1.31578947368421
4
1.75438596491228
0
0
132
57.8947368421053
228
128, 128, 128
3
Solid
50
No
273
Commented
11/15/2022 8:55:37 AM
I am having the same problem. Did you find the solution?
iwftgmx
webscraping
Mikeyandwind
t1_iwftgmx
https://www.reddit.com/r/webscraping/comments/yjlscq/octoparse_infinite_scroll_problem/iwftgmx/
11/15/2022 8:55:37 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
yjlscq
t3_yjlscq
yjlscq
0
yjlscq
False
False
False
0
1
64
64
0
0
1
9.09090909090909
0
0
4
36.3636363636364
11
128, 128, 128
3
Solid
50
No
272
Posted
7/27/2022 5:07:30 AM
https://hamadcrack.com/octoparse-crack/
w95rw1
windows
Lost_Buy_8044
t3_w95rw1
https://www.reddit.com/r/windows/comments/w95rw1/octoparse_crack_852_product_key_full_download_2023/
7/27/2022 5:07:30 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse Crack 8.5.2 Product Key Full Download {2023}
False
1
w95rw1
0
1
1
1
128, 128, 128
3
Solid
50
No
271
Posted
4/17/2022 8:59:33 AM
I don't use octoparse very often but if there are any experts around, I would appreciate a quick pointer!
u5j5eb
webscraping
theopinionexpert
t3_u5j5eb
https://www.reddit.com/r/webscraping/comments/u5j5eb/octoparse_detects_information_in_click_element_in/
4/17/2022 8:59:33 AM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse detects information in click element in the preview but doesn't extract it
False
0.67
u5j5eb
0
1
1
1
1
5.26315789473684
0
0
0
0
7
36.8421052631579
19
128, 128, 128
3
Solid
50
No
270
Posted
6/23/2022 5:48:28 AM
[removed]
vipoai
PiratedGames
Emergency-Nose-2980
t3_vipoai
https://www.reddit.com/r/PiratedGames/comments/vipoai/octoparse_842_crack_with_activation_key_x64_free/
6/23/2022 5:48:28 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse 8.4.2 Crack With Activation Key (x64) Free Download 2022
False
1
vipoai
0
1
9
9
0
0
0
0
0
0
1
100
1
128, 128, 128
3
Solid
50
No
269
Commented
6/23/2022 5:48:28 AM
Make sure to read the stickied [megathread](https://rentry.org/pgames-mega-thread), as it might just answer your question! Also check out our [videogame piracy guide](https://www.reddit.com/r/PiratedGames/comments/i3r14g/a_beginners_guide_to_video_game_piracy/) and the list of Common Q&A [part 1](https://www.reddit.com/r/PiratedGames/comments/fvix6e/common_questions_and_answers_thread/) and [part 2](https://www.reddit.com/r/PiratedGames/comments/igxebs/frequently_asked_questions_part_2/). Or just read the whole [Wiki](https://www.reddit.com/r/PiratedGames/wiki/index).
*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/PiratedGames) if you have any questions or concerns.*
idedk8c
PiratedGames
AutoModerator
t1_idedk8c
https://www.reddit.com/r/PiratedGames/comments/vipoai/octoparse_842_crack_with_activation_key_x64_free/idedk8c/
6/23/2022 5:48:28 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
vipoai
t3_vipoai
vipoai
0
vipoai
False
False
True
0
1
9
9
0
0
1
0.917431192660551
0
0
54
49.5412844036697
109
128, 128, 128
3
Solid
50
No
268
Posted
5/31/2022 1:23:58 AM
https://xilisoftcrack.com/octoparse-crack/
v1eudc
FreeKarma4You
Remarkable-Quail146
t3_v1eudc
https://www.reddit.com/r/FreeKarma4You/comments/v1eudc/octoparse_850_crack_windows_x/
5/31/2022 1:23:58 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse 8.5.0 Crack Windows (x
True
1
v1eudc
0
1
1
1
128, 128, 128
3
Solid
50
No
267
RepliedTo
10/10/2020 3:53:19 PM
Beep. Boop. I'm a robot.
Here's a copy of
###[1984](https://snewd.com/ebooks/1984-george-orwell/)
Was I a good bot? | [info](https://www.reddit.com/user/Reddit-Book-Bot/) | [More Books](https://old.reddit.com/user/Reddit-Book-Bot/comments/i15x1d/full_list_of_books_and_commands/)
g8c05ux
learnpython
Reddit-Book-Bot
t1_g8c05ux
https://www.reddit.com/r/learnpython/comments/j82e18/using_octoparse_to_scrape_comments/g8c05ux/
10/10/2020 3:53:19 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
g8c04p4
t1_g8c04p4
g8c04p4
0
j82e18
False
False
False
5
1
12
12
1
2.27272727272727
0
0
0
0
29
65.9090909090909
44
128, 128, 128
3
Solid
50
No
266
Posted
1/15/2023 11:15:42 PM
Hello, I'm trying to make Octoparse click on "next page" to continue the scraping. I tried to create a loop with pagination, but it never works... the result is that the first page is scraped correctly, but the second, third, fourth, etc. are not, and the first page keeps extracting duplicated data.
Does anyone know where the problem comes from?
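For context, outside the Octoparse GUI the same "next page" loop can be sketched in plain Python (stdlib only). The `rel="next"` link pattern is an assumption — many sites mark their next-page link differently, so the regex would need adjusting per site. The seen-set guards against exactly the symptom described above, where page 1 keeps re-extracting as duplicates:

```python
import re
import urllib.request

def next_page_url(html):
    """Find the href of a rel="next" link; None means last page.
    (Regex scraping is fragile; attribute order rel-before-href is assumed.)"""
    m = re.search(r'rel="next"[^>]*href="([^"]+)"', html)
    return m.group(1) if m else None

def scrape_all_pages(start_url, extract, max_pages=100):
    """Follow 'next' links, calling extract(html) on each page.
    The seen-set stops the loop from re-fetching page 1 forever."""
    url, seen, rows = start_url, set(), []
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
        rows.extend(extract(html))
        url = next_page_url(html)
    return rows
```

If the loop keeps returning the same page, the next-link detection (here the regex, in Octoparse the XPath of the "next page" button) is usually what is wrong.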
10cy6ml
webscraping
Inventeurduzdong
t3_10cy6ml
https://www.reddit.com/r/webscraping/comments/10cy6ml/octoparse_pagination_next_page/
1/15/2023 11:15:42 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse pagination "next page"
False
1
10cy6ml
0
1
1
1
2
3.27868852459016
1
1.63934426229508
0
0
29
47.5409836065574
61
128, 128, 128
3
Solid
50
Yes
264
Commented
3/5/2023 2:31:56 PM
If you need a tutor for this, this space isn't for you.
Web scraping and other programming-based activities require work and research, not just "somebody teach me."
You'll likely rely on your tutor for much more even after you're done being tutored.
jb0ighi
webscraping
just-sum-dude69
t1_jb0ighi
https://www.reddit.com/r/webscraping/comments/11i4irm/hiring_a_tutor_for_octoparse/jb0ighi/
3/5/2023 2:31:56 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
11i4irm
t3_11i4irm
11i4irm
1
11i4irm
False
False
False
0
1
32
32
1
2.38095238095238
0
0
0
0
18
42.8571428571429
42
128, 128, 128
3
Solid
50
Yes
263
RepliedTo
3/5/2023 9:35:14 PM
I realize this, but I learn better when being shown by someone. I don’t see how it is a bad thing to ask for help. Especially if I’m willing to pay someone.
jb270ej
webscraping
mostlybeak
t1_jb270ej
https://www.reddit.com/r/webscraping/comments/11i4irm/hiring_a_tutor_for_octoparse/jb270ej/
3/5/2023 9:35:14 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jb0ighi
t1_jb0ighi
jb0ighi
2
11i4irm
True
False
False
1
1
32
32
2
5.88235294117647
1
2.94117647058824
0
0
13
38.2352941176471
34
128, 128, 128
3
Solid
50
No
262
RepliedTo
3/7/2023 4:12:39 PM
I've taught several web scraping classes before, for both technical and non-technical people. Have you solved your workflow issue already?
jba6wbw
webscraping
omnipotentsoul
t1_jba6wbw
https://www.reddit.com/r/webscraping/comments/11i4irm/hiring_a_tutor_for_octoparse/jba6wbw/
3/7/2023 4:12:39 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jb270ej
t1_jb270ej
jb270ej
0
11i4irm
False
False
False
2
1
32
32
0
0
1
4.34782608695652
0
0
13
56.5217391304348
23
128, 128, 128
3
Solid
50
Yes
261
RepliedTo
3/6/2023 2:59:15 PM
Are you only interested in learning how to scrape with Octoparse? There are other web scraping tools that offer full tutorials and amazing content on web scraping: [https://brightdata.com/discovery-zone](https://brightdata.com/discovery-zone)
jb56qe7
webscraping
noah_bd
t1_jb56qe7
https://www.reddit.com/r/webscraping/comments/11i4irm/hiring_a_tutor_for_octoparse/jb56qe7/
3/6/2023 2:59:15 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jb270ej
t1_jb270ej
jb270ej
1
11i4irm
False
False
False
2
1
32
32
1
2.7027027027027
0
0
0
0
19
51.3513513513514
37
128, 128, 128
3
Solid
50
Yes
260
RepliedTo
3/6/2023 5:20:01 PM
Thank you!
Not necessarily, I’m just trying to solve a particular problem and it seemed like Octoparse would be an easy way to go
jb5rmdk
webscraping
mostlybeak
t1_jb5rmdk
https://www.reddit.com/r/webscraping/comments/11i4irm/hiring_a_tutor_for_octoparse/jb5rmdk/
3/6/2023 5:20:01 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jb56qe7
t1_jb56qe7
jb56qe7
0
11i4irm
True
False
False
3
1
32
32
2
8
1
4
0
0
8
32
25
128, 128, 128
3
Solid
50
No
265
Posted
3/4/2023 4:29:52 PM
I’m looking to hire someone to walk me through how to set up a couple workflows in Octoparse.
Does anyone here do that?
11i4irm
webscraping
mostlybeak
t3_11i4irm
https://www.reddit.com/r/webscraping/comments/11i4irm/hiring_a_tutor_for_octoparse/
3/4/2023 4:29:52 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Hiring a Tutor for Octoparse
False
1
11i4irm
0
1
32
32
0
0
0
0
0
0
9
37.5
24
128, 128, 128
3
Solid
50
No
259
RepliedTo
4/28/2020 5:03:50 PM
Maybe a graphdb could help here as well
fov0kuy
webscraping
Miserable_Author
t1_fov0kuy
https://www.reddit.com/r/webscraping/comments/g9n330/help_with_hopefully_simple_unstructured_web/fov0kuy/
4/28/2020 5:03:50 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fov0gs9
t1_fov0gs9
fov0gs9
0
g9n330
False
False
False
1
1
1
1
1
12.5
0
0
0
0
3
37.5
8
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
258
RepliedTo
4/28/2020 1:30:33 PM
Thanks for the feedback. I am not looking for help to build a scraping tool or any kind of manual work. That would be outrageous and I am totally with you!
What I meant is that if you say: this is the tutorial you are looking for or this is a good tool or an easy approach (I don’t have much programming background) and I manage to get it to work, I’d be more than happy to send that money their way for literally 5-10 min of work.
foubaon
webscraping
Iam_the_analyst
t1_foubaon
https://www.reddit.com/r/webscraping/comments/g9n330/help_with_hopefully_simple_unstructured_web/foubaon/
4/28/2020 1:30:33 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fouaysd
t1_fouaysd
fouaysd
1
g9n330
False
False
False
1
4
65
65
6
6.59340659340659
1
1.0989010989011
0
0
27
29.6703296703297
91
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
257
RepliedTo
4/28/2020 3:18:01 PM
No problem. I gotcha now. It all depends on what language you’re going to be using.
foun5ub
webscraping
Vote4flipflop
t1_foun5ub
https://www.reddit.com/r/webscraping/comments/g9n330/help_with_hopefully_simple_unstructured_web/foun5ub/
4/28/2020 3:18:01 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
foubaon
t1_foubaon
foubaon
1
g9n330
False
False
False
2
4
65
65
0
0
1
5.88235294117647
0
0
6
35.2941176470588
17
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
256
RepliedTo
4/28/2020 3:20:56 PM
That’s why I appreciate the feedback. I wasn’t hoping to have to build it all from scratch, but even if there were just an easy search engine mod or similar that I could use, I’d pay the person (I know it is based on trust) who leads me to it.
founin2
webscraping
Iam_the_analyst
t1_founin2
https://www.reddit.com/r/webscraping/comments/g9n330/help_with_hopefully_simple_unstructured_web/founin2/
4/28/2020 3:20:56 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
foun5ub
t1_foun5ub
foun5ub
1
g9n330
False
False
False
3
4
65
65
4
7.69230769230769
1
1.92307692307692
0
0
12
23.0769230769231
52
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
255
RepliedTo
4/28/2020 3:24:54 PM
I don’t know much about search engine mods to scrape or interact with pages. I usually write my stuff from scratch and it’s usually specific to a site or few but not 5000 sites.
founzx8
webscraping
Vote4flipflop
t1_founzx8
https://www.reddit.com/r/webscraping/comments/g9n330/help_with_hopefully_simple_unstructured_web/founzx8/
4/28/2020 3:24:54 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
founin2
t1_founin2
founin2
0
g9n330
False
False
False
4
4
65
65
0
0
1
2.77777777777778
0
0
17
47.2222222222222
36
128, 128, 128
3
Solid
50
No
254
RepliedTo
10/28/2021 6:54:22 PM
JavaScript is TypeScript; TypeScript is not necessarily JavaScript. That's what we call a superset.
C++ is a superset of C, yet nobody says that C++ is just C and therefore not a language.
There are several others, like Objective-C.
Some employers specifically hire TypeScript developers. So yes, I believe it's legit to separate JS and TS.
hiez98o
Quebec
cdash04
t1_hiez98o
https://www.reddit.com/r/Quebec/comments/qhdjti/cest_vrai_les_15_langages_de_programmation_les/hiez98o/
10/28/2021 6:54:22 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hicfvpe
t1_hicfvpe
hicfvpe
0
qhdjti
False
False
False
1
1
66
66
1
1.36986301369863
0
0
0
0
34
46.5753424657534
73
128, 128, 128
3
Solid
50
No
253
RepliedTo
10/28/2021 10:57:03 AM
> consistant
constant
hid6r1r
Quebec
kornikopic
t1_hid6r1r
https://www.reddit.com/r/Quebec/comments/qhdjti/cest_vrai_les_15_langages_de_programmation_les/hid6r1r/
10/28/2021 10:57:03 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hiccscu
t1_hiccscu
hiccscu
0
qhdjti
False
False
False
1
1
67
67
0
0
0
0
0
0
2
100
2
128, 128, 128
3
Solid
50
No
252
RepliedTo
10/28/2021 6:46:00 AM
To be honest, I can believe that LAMP is in the top five: it's old, it's a pain, but damn, it's still everywhere.
hiconp7
Quebec
Regular-Exchange8376
t1_hiconp7
https://www.reddit.com/r/Quebec/comments/qhdjti/cest_vrai_les_15_langages_de_programmation_les/hiconp7/
10/28/2021 6:46:00 AM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hicac8f
t1_hicac8f
hicac8f
0
qhdjti
False
False
False
2
1
68
68
1
4.54545454545455
0
0
0
0
15
68.1818181818182
22
128, 128, 128
3
Solid
50
No
251
Commented
7/30/2021 9:22:55 PM
Saved this! Thanks for the awesome write-up. I am in need of some data, so this might be the step in the right direction I need. Mind if I hit up your DMs later if I need some help?
h74qziz
webscraping
Eranthius
t1_h74qziz
https://www.reddit.com/r/webscraping/comments/o6o8g7/top_5_scraping_tools_for_beginners_importio_vs/h74qziz/
7/30/2021 9:22:55 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o6o8g7
t3_o6o8g7
o6o8g7
0
o6o8g7
False
False
False
0
1
2
2
2
5
0
0
0
0
14
35
40
128, 128, 128
3
Solid
50
Yes
250
Commented
6/30/2021 10:31:26 PM
**Disclaimer: I do work for SerpApi**
SerpApi handles many search engine result pages: for example Google, Baidu, Bing, Yahoo!, Yandex, eBay, YouTube, Walmart, and The Home Depot.
Reach out to customer support if you are looking for options to scrape Google: contact@serpapi.com
Here's our playground demonstration area that shows how and what we are able to scrape:
[https://serpapi.com/playground](https://serpapi.com/playground)
h3m594x
webscraping
justinSerpApi
t1_h3m594x
https://www.reddit.com/r/webscraping/comments/o6o8g7/top_5_scraping_tools_for_beginners_importio_vs/h3m594x/
6/30/2021 10:31:26 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o6o8g7
t3_o6o8g7
o6o8g7
1
o6o8g7
False
False
False
0
1
2
2
2
2.98507462686567
0
0
0
0
37
55.2238805970149
67
128, 128, 128
3
Solid
50
Yes
249
RepliedTo
7/1/2021 12:29:21 AM
Nice! Thanks for sharing :)
h3mjcry
webscraping
carlpaul153
t1_h3mjcry
https://www.reddit.com/r/webscraping/comments/o6o8g7/top_5_scraping_tools_for_beginners_importio_vs/h3mjcry/
7/1/2021 12:29:21 AM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h3m594x
t1_h3m594x
h3m594x
0
o6o8g7
True
False
False
1
1
2
2
1
25
0
0
0
0
2
50
4
128, 128, 128
3
Solid
50
Yes
248
Commented
6/29/2021 6:07:13 PM
If I have no programming experience or skills at all, could I use these tools? Or what would I have to learn to use them?
h3gsc0i
webscraping
propilmetil
t1_h3gsc0i
https://www.reddit.com/r/webscraping/comments/o6o8g7/top_5_scraping_tools_for_beginners_importio_vs/h3gsc0i/
6/29/2021 6:07:13 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o6o8g7
t3_o6o8g7
o6o8g7
1
o6o8g7
False
False
False
0
1
2
2
0
0
0
0
0
0
9
39.1304347826087
23
128, 128, 128
3
Solid
50
Yes
247
RepliedTo
6/29/2021 10:51:46 PM
They are designed for people like you! Some are particularly straightforward. Try it and you will see ;)
h3hunzh
webscraping
carlpaul153
t1_h3hunzh
https://www.reddit.com/r/webscraping/comments/o6o8g7/top_5_scraping_tools_for_beginners_importio_vs/h3hunzh/
6/29/2021 10:51:46 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h3gsc0i
t1_h3gsc0i
h3gsc0i
0
o6o8g7
True
False
False
1
1
2
2
1
5.88235294117647
0
0
0
0
5
29.4117647058824
17
128, 128, 128
3
Solid
50
No
246
Posted
6/30/2021 11:40:39 AM
Hello. I am trying to extract data from multiple pages with Octoparse 8. I watched [https://www.youtube.com/watch?v=7I5O53SZ6dY](https://www.youtube.com/watch?v=7I5O53SZ6dY) before trying. I am on the free plan, if that is the reason. I want to extract the information from this site: [http://public.ciab-bg.com/index.php?action=registrar&setElsPerPage=50&account\_type=personal&page=1](http://public.ciab-bg.com/index.php?action=registrar&setElsPerPage=50&account_type=personal&page=1). There are 63 pages, but I am downloading only 3 (the first two pages and the last one). I am using the Paginate button but it's not helping me. :(
Thanks in advance.
oavipa
webscraping
propilmetil
t3_oavipa
https://www.reddit.com/r/webscraping/comments/oavipa/how_to_extract_data_from_multiple_pages_with/
6/30/2021 11:40:39 AM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to extract data from multiple pages with Octoparse 8.
False
1
oavipa
0
1
2
2
2
1.63934426229508
0
0
0
0
54
44.2622950819672
122
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
245
Commented
7/1/2021 9:16:13 AM
Sounds like you're being blocked...are you using rotating [residential proxies](https://brightdata.com/proxy-types/rotating-residential-ips)?
h3nwmb8
webscraping
Gidoneli
t1_h3nwmb8
https://www.reddit.com/r/webscraping/comments/oavipa/how_to_extract_data_from_multiple_pages_with/h3nwmb8/
7/1/2021 9:16:13 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
oavipa
t3_oavipa
oavipa
1
oavipa
False
False
False
0
2
2
2
0
0
0
0
0
0
13
68.4210526315789
19
128, 128, 128
3
Solid
50
Yes
244
RepliedTo
7/1/2021 11:03:06 AM
No. What are residential proxies, and what are the benefits of using them? Keep in mind that I am from Bulgaria and I am trying to extract data from a Bulgarian website.
h3o43op
webscraping
propilmetil
t1_h3o43op
https://www.reddit.com/r/webscraping/comments/oavipa/how_to_extract_data_from_multiple_pages_with/h3o43op/
7/1/2021 11:03:06 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h3nwmb8
t1_h3nwmb8
h3nwmb8
1
oavipa
True
False
False
1
1
2
2
1
3.33333333333333
0
0
0
0
9
30
30
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
243
RepliedTo
7/1/2021 2:59:49 PM
Geo-blocking is just one part of the equation.
If you exceed the website's request rate limit, it will detect you as a bot/crawler.
Using a rotating residential proxy network (see the link I shared above) to route your crawler requests through will make the website think you are a normal user (or several of them) just browsing the site from real user IPs, and it won't block you.
In case it is a really advanced website with further defenses, you need a proxy with unblocking technology like [this one](https://brightdata.grsm.io/vitariz-unlocker).
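As a minimal sketch of what "routing requests through a proxy" means in code (stdlib `urllib` only; the gateway host, port, and credentials are placeholders — a real rotating residential service gives you its own endpoint behind which it swaps the exit IP per request):

```python
import urllib.request

def opener_for_proxy(proxy_url):
    """Build a urllib opener that routes all requests through one proxy endpoint.
    With a gateway-style rotating service, each request can leave from a
    different residential IP even though the client always talks to one host."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)

# usage sketch (credentials and host are placeholders):
# opener = opener_for_proxy("http://user:pass@gateway.example.com:22225")
# html = opener.open("http://example.com/listings?page=1", timeout=10).read()
```

Pairing this with a polite delay between requests addresses the rate-limit point above as well.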
h3ou2e1
webscraping
Gidoneli
t1_h3ou2e1
https://www.reddit.com/r/webscraping/comments/oavipa/how_to_extract_data_from_multiple_pages_with/h3ou2e1/
7/1/2021 2:59:49 PM
12/27/2022 5:34:57 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h3o43op
t1_h3o43op
h3o43op
0
oavipa
False
False
False
2
2
2
2
2
2.06185567010309
1
1.03092783505155
0
0
49
50.5154639175258
97
128, 128, 128
3
Solid
50
Yes
242
Commented
6/24/2021 12:08:30 AM
Thank you for the write up!
h2tvcaw
webscraping
SolAlliance
t1_h2tvcaw
https://www.reddit.com/r/webscraping/comments/o6o8g7/top_5_scraping_tools_for_beginners_importio_vs/h2tvcaw/
6/24/2021 12:08:30 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o6o8g7
t3_o6o8g7
o6o8g7
1
o6o8g7
False
False
False
0
1
2
2
1
16.6666666666667
0
0
0
0
1
16.6666666666667
6
128, 128, 128
3
Solid
50
Yes
241
RepliedTo
6/24/2021 12:45:55 AM
thanks for thanking :)
h2tzjdf
webscraping
carlpaul153
t1_h2tzjdf
https://www.reddit.com/r/webscraping/comments/o6o8g7/top_5_scraping_tools_for_beginners_importio_vs/h2tzjdf/
6/24/2021 12:45:55 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h2tvcaw
t1_h2tvcaw
h2tvcaw
0
o6o8g7
True
False
False
1
1
2
2
0
0
0
0
0
0
2
66.6666666666667
3
128, 128, 128
3.0236518448439
Dash Dot Dot
49.898634950669
No
613
Posted
6/23/2021 10:30:41 PM
[removed]
o6nn4k
datascience
carlpaul153
t3_o6nn4k
https://www.reddit.com/r/datascience/comments/o6nn4k/top_5_scraping_tools_for_beginners_importio_vs/
6/23/2021 10:30:41 PM
1/1/0001 12:00:00 AM
False
False
16
4
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Top 5 scraping tools for beginners: Import.io vs Octoparse vs Mozenda vs ParseHub vs Dexi.io.
False
0.81
o6nn4k
0
26
2
2
0
0
0
0
0
0
1
100
1
128, 128, 128
3.0236518448439
Dash Dot Dot
49.898634950669
No
612
Posted
6/23/2021 10:53:08 PM
[removed]
o6o1r9
learnprogramming
carlpaul153
t3_o6o1r9
https://www.reddit.com/r/learnprogramming/comments/o6o1r9/top_5_scraping_tools_for_beginners_importio_vs/
6/23/2021 10:53:08 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Top 5 scraping tools for beginners: Import.io vs Octoparse vs Mozenda vs ParseHub vs Dexi.io.
False
0.5
o6o1r9
0
26
2
2
0
0
0
0
0
0
1
100
1
128, 128, 128
3.0236518448439
Dash Dot Dot
49.898634950669
No
611
Posted
6/23/2021 10:57:11 PM
[removed]
o6o4a3
dropship
carlpaul153
t3_o6o4a3
https://www.reddit.com/r/dropship/comments/o6o4a3/top_5_scraping_tools_for_beginners_importio_vs/
6/23/2021 10:57:11 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Top 5 scraping tools for beginners: Import.io vs Octoparse vs Mozenda vs ParseHub vs Dexi.io
False
1
o6o4a3
0
26
2
2
0
0
0
0
0
0
1
100
1
128, 128, 128
3.0236518448439
Dash Dot Dot
49.898634950669
No
610
Posted
6/23/2021 10:36:56 PM
[removed]
o6nrcv
ecommerce
carlpaul153
t3_o6nrcv
https://www.reddit.com/r/ecommerce/comments/o6nrcv/top_5_scraping_tools_for_beginners_importio_vs/
6/23/2021 10:36:56 PM
1/1/0001 12:00:00 AM
False
False
9
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Top 5 scraping tools for beginners: Import.io vs Octoparse vs Mozenda vs ParseHub vs Dexi.io.
False
0.91
o6nrcv
0
26
2
2
0
0
0
0
0
0
1
100
1
128, 128, 128
3.0236518448439
Dash Dot Dot
49.898634950669
No
609
Commented
6/23/2021 10:46:36 PM
Here are the links: [Import.io](https://Import.io), [Octoparse](https://www.octoparse.com/signup?re=FudAl6IU), [Mozenda](https://www.mozenda.com/), [ParseHub](https://www.parsehub.com/), [Dexi.io](https://Dexi.io).
Clarification: the link to octoparse is an invitation referral (not affiliated). If this post has served you well, consider signing up using it. I receive a free premium month, and you get a 30% discount (until June 25) :D
h2tm25k
ecommerce
carlpaul153
t1_h2tm25k
https://www.reddit.com/r/ecommerce/comments/o6nrcv/top_5_scraping_tools_for_beginners_importio_vs/h2tm25k/
6/23/2021 10:46:36 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o6nrcv
t3_o6nrcv
o6nrcv
1
o6nrcv
True
False
False
0
26
2
2
2
2.8169014084507
0
0
0
0
36
50.7042253521127
71
128, 128, 128
3.0236518448439
Dash Dot Dot
49.898634950669
No
608
Posted
6/23/2021 11:03:29 PM
I have selected the most popular web scraping tools that are friendly for people with little programming skills. Don't be fooled by their simplicity, some of them also support advanced programmable functions.
I have ordered them according to my personal preference (favorites at the end). Of course, this is an opinion and I recommend you do your own research.
* [Import.io](https://www.import.io/): has gained popularity for the way it automatically converts any website into structured data and for its nice interface. Although it can be useful on simple page structures, it is not very good at handling more varied websites.
* [Dexi.io](https://www.dexi.io/): similar in usability to Parsehub. Requires more advanced programming skills compared to the following scrapers. Has three types of robots available: extractor, crawler, pipes.
* [Parsehub](https://www.parsehub.com/): it can deal with complicated scenarios. Although it is intended to offer an easy web scraping experience, a typical user will still need to be a bit technical to fully understand many of its advanced functionalities.
* [Mozenda](https://www.mozenda.com/): one of the "oldest" web scraping products on the market. It has an attractive user interface and very powerful, advanced options. There is not much to criticize, except... it's very expensive, and there's no free version :(
* [Octoparse](https://www.octoparse.com/): this is my favorite. Like Mozenda, it is very simple to use and has powerful advanced options. It guesses the fields surprisingly well, which saves a lot of time. If this guide is helping you and you are interested in this tool, consider registering from [here](https://www.octoparse.com/signup?re=FudAl6IU); I will get 1 month of the pro version free and you will get a 30% discount (until June 25).
o6o8g7
webscraping
carlpaul153
t3_o6o8g7
https://www.reddit.com/r/webscraping/comments/o6o8g7/top_5_scraping_tools_for_beginners_importio_vs/
6/23/2021 11:03:29 PM
1/1/0001 12:00:00 AM
False
False
10
2
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Top 5 scraping tools for beginners: Import.io vs Octoparse vs Mozenda vs ParseHub vs Dexi.io
False
0.82
o6o8g7
0
26
2
2
23
7.66666666666667
4
1.33333333333333
0
0
126
42
300
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
240
Posted
3/13/2017 6:42:33 PM
[removed]
5z6t5f
scrapinghub
Andre380
t3_5z6t5f
https://www.reddit.com/r/scrapinghub/comments/5z6t5f/web_scraping_using_octoparse/
3/13/2017 6:42:33 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Web Scraping using OCTOPARSE
False
1
5z6t5f
0
4
1
1
0
0
0
0
0
0
1
100
1
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
239
Posted
3/13/2017 6:26:37 PM
[removed]
5z6pgj
scrapingtheweb
Andre380
t3_5z6pgj
https://www.reddit.com/r/scrapingtheweb/comments/5z6pgj/web_scrape_with_octoparse/
3/13/2017 6:26:37 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Web Scrape with OCTOPARSE
False
1
5z6pgj
0
4
1
1
0
0
0
0
0
0
1
100
1
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
236
Commented
10/14/2021 12:27:51 AM
Easy peasy: look into web scraping. I'd recommend a Python script that does each of the steps you listed. BeautifulSoup 4 is the library you'd want in Python, and there are lots of YouTube tutorials for this.
Basically:
Use the URLs for the sites you want to access.
Then use some kind of identifier (usually an ID tag, but sometimes you have to use a class).
Then grab the data and load it into a CSV or database.
Then populate an HTML page with this table.
Python, HTML, and Bootstrap are basically all you'd need.
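The steps above can be sketched even without BeautifulSoup, using only Python's stdlib (`html.parser` and `csv`); the `target_class` selector here is a hypothetical example — in practice you would use whatever ID or class the site actually exposes:

```python
import csv
import io
from html.parser import HTMLParser

class CellCollector(HTMLParser):
    """Steps 2-3 above: grab text from elements carrying a known class."""
    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self._depth = 0          # >0 while inside a matching element
        self.cells = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._depth or self.target_class in classes:
            self._depth += 1     # track nesting so inner tags stay "inside"

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth and data.strip():
            self.cells.append(data.strip())

def to_csv(cells):
    """Step 4: load the grabbed data into CSV text (one value per row)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for cell in cells:
        writer.writerow([cell])
    return buf.getvalue()
```

From there, step 5 is just rendering `cells` into an HTML `<table>` template.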
hgjwxtw
webdev
mud002
t1_hgjwxtw
https://www.reddit.com/r/webdev/comments/q7ndfx/how_to_auto_populate_notes_for_cars_on_an_online/hgjwxtw/
10/14/2021 12:27:51 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
q7ndfx
t3_q7ndfx
q7ndfx
1
q7ndfx
False
False
False
0
2
33
33
3
3.33333333333333
0
0
0
0
42
46.6666666666667
90
128, 128, 128
3
Solid
50
Yes
235
RepliedTo
10/14/2021 2:49:04 AM
Can you recommend any YouTube videos for it?
hgkeh2j
webdev
theofficialjewses
t1_hgkeh2j
https://www.reddit.com/r/webdev/comments/q7ndfx/how_to_auto_populate_notes_for_cars_on_an_online/hgkeh2j/
10/14/2021 2:49:04 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hgjwxtw
t1_hgjwxtw
hgjwxtw
2
q7ndfx
True
False
False
1
1
33
33
1
12.5
0
0
0
0
2
25
8
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
234
RepliedTo
10/14/2021 2:54:09 AM
This is what I used when I was playing with it all. Also learned a bit about virtual environment setup. There’s also a post on medium.com or something about this, was very useful.
https://realpython.com/beautiful-soup-web-scraper-python/
hgkf3d3
webdev
mud002
t1_hgkf3d3
https://www.reddit.com/r/webdev/comments/q7ndfx/how_to_auto_populate_notes_for_cars_on_an_online/hgkf3d3/
10/14/2021 2:54:09 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hgkeh2j
t1_hgkeh2j
hgkeh2j
0
q7ndfx
False
False
False
2
2
33
33
1
2.85714285714286
0
0
0
0
10
28.5714285714286
35
128, 128, 128
3
Solid
50
No
233
RepliedTo
10/14/2021 3:03:50 AM
Here you [go.](https://automatetheboringstuff.com/) There is a chapter on web scraping.
Edit: I should note that it’s not a YouTube video.
hgkg8hb
webdev
Spank_Engine
t1_hgkg8hb
https://www.reddit.com/r/webdev/comments/q7ndfx/how_to_auto_populate_notes_for_cars_on_an_online/hgkg8hb/
10/14/2021 3:03:50 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hgkeh2j
t1_hgkeh2j
hgkeh2j
0
q7ndfx
False
False
False
2
1
33
33
0
0
0
0
0
0
9
37.5
24
128, 128, 128
3
Solid
50
No
669
Posted
4/19/2020 10:29:33 AM
Hi folks!
I am evaluating web scraping tools. I made a list of alternatives:
ScrapeStorm, Octoparse, Scrapestack, Apify, Web Scraper, Scrapy (ScrapingHub), Mozenda, ParseHub, Dexi, Diffbot.
So far I have already tried Scrapy, Apify and Octoparse. Do you think I should try the rest, or are Scrapy and Octoparse the main solutions on the market?
Looking forward to your replies.
Thanks in advance :)
g45iym
webscraping
AndroidePsicokiller
t3_g45iym
https://www.reddit.com/r/webscraping/comments/g45iym/evaluating_web_scraping_tools/
4/19/2020 10:29:33 AM
1/1/0001 12:00:00 AM
False
False
5
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Evaluating web scraping tools
False
1
g45iym
0
1
6
6
0
0
0
0
0
0
40
64.5161290322581
62
128, 128, 128
3
Solid
50
No
231
Commented
6/4/2020 8:21:56 AM
Also, Twitter has an API; it could be faster and better organized.
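As a sketch of that API route: Twitter's v2 recent-search endpoint can be queried with stdlib `urllib` (note that on standard access it only covers roughly the last 7 days, so it is not truly historical). The bearer token is a placeholder you would obtain from the developer portal:

```python
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"  # Twitter API v2

def build_search_request(query, bearer_token, max_results=100):
    """Build an authenticated GET request for recent tweets matching `query`.
    max_results is capped by the API at 100 per page."""
    params = urllib.parse.urlencode({"query": query, "max_results": max_results})
    return urllib.request.Request(
        f"{SEARCH_URL}?{params}",
        headers={"Authorization": f"Bearer {bearer_token}"},
    )

# usage sketch (network call not shown; token is a placeholder):
# req = build_search_request("from:nytimes", "YOUR_BEARER_TOKEN")
# data = urllib.request.urlopen(req).read()
```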
fsu9dt5
webscraping
AndroidePsicokiller
t1_fsu9dt5
https://www.reddit.com/r/webscraping/comments/gw8v79/how_to_scrape_historial_twitter_data_with/fsu9dt5/
6/4/2020 8:21:56 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gw8v79
t3_gw8v79
gw8v79
1
gw8v79
False
False
False
0
1
6
5
2
18.1818181818182
0
0
0
0
3
27.2727272727273
11
128, 128, 128
3
Solid
50
Yes
224
RepliedTo
6/4/2020 5:34:02 AM
Did you make it?
fstxg95
webscraping
lostnfoundaround
t1_fstxg95
https://www.reddit.com/r/webscraping/comments/gw8v79/how_to_scrape_historial_twitter_data_with/fstxg95/
6/4/2020 5:34:02 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fstjj7w
t1_fstjj7w
fstjj7w
1
gw8v79
False
False
False
1
1
5
5
0
0
0
0
0
0
1
25
4
128, 128, 128
3
Solid
50
Yes
223
RepliedTo
6/4/2020 6:12:31 AM
Ya I followed the steps and it worked!
fsu0h8e
webscraping
Millyfang
t1_fsu0h8e
https://www.reddit.com/r/webscraping/comments/gw8v79/how_to_scrape_historial_twitter_data_with/fsu0h8e/
6/4/2020 6:12:31 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fstxg95
t1_fstxg95
fstxg95
1
gw8v79
True
False
False
2
1
5
5
1
12.5
0
0
0
0
2
25
8
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
222
RepliedTo
6/4/2020 3:56:06 PM
I wanted to scrape a news website, but this doesn't work. Have you tried it? I am working in VBA now.
fsvdtue
webscraping
tusharg19
t1_fsvdtue
https://www.reddit.com/r/webscraping/comments/gw8v79/how_to_scrape_historial_twitter_data_with/fsvdtue/
6/4/2020 3:56:06 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fsu0h8e
t1_fsu0h8e
fsu0h8e
1
gw8v79
False
False
False
3
4
5
5
1
5.26315789473684
0
0
0
0
7
36.8421052631579
19
128, 128, 128
3
Solid
50
Yes
221
RepliedTo
6/5/2020 1:24:11 AM
Which news website are you trying to scrape? Octoparse is well known for scraping news websites... I think you can check out their [beginner's tutorial](https://www.octoparse.com/blog/extract-data-with-auto-detection?re=). It should take you less than a few minutes to build a news site crawler once you get the hang of it.
fsxbjnc
webscraping
Millyfang
t1_fsxbjnc
https://www.reddit.com/r/webscraping/comments/gw8v79/how_to_scrape_historial_twitter_data_with/fsxbjnc/
6/5/2020 1:24:11 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fsvdtue
t1_fsvdtue
fsvdtue
2
gw8v79
True
False
False
4
1
5
5
1
1.69491525423729
1
1.69491525423729
0
0
30
50.8474576271186
59
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
220
RepliedTo
6/5/2020 2:48:13 AM
Messaged you, please check.
fsxk90o
webscraping
tusharg19
t1_fsxk90o
https://www.reddit.com/r/webscraping/comments/gw8v79/how_to_scrape_historial_twitter_data_with/fsxk90o/
6/5/2020 2:48:13 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fsxbjnc
t1_fsxbjnc
fsxbjnc
0
gw8v79
False
False
False
5
4
5
5
0
0
0
0
0
0
3
75
4
128, 128, 128
3
Solid
50
No
219
RepliedTo
6/5/2020 9:18:43 AM
You sound like a salesman!
fsyfd6l
webscraping
chevignon93
t1_fsyfd6l
https://www.reddit.com/r/webscraping/comments/gw8v79/how_to_scrape_historial_twitter_data_with/fsyfd6l/
6/5/2020 9:18:43 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fsxbjnc
t1_fsxbjnc
fsxbjnc
0
gw8v79
False
False
False
5
1
5
5
0
0
0
0
0
0
2
40
5
128, 128, 128
3
Solid
50
No
218
Posted
10/20/2020 8:48:17 AM
Hello - I'm wondering if anybody can help. I'm scraping this real estate agent's listings with Octoparse. It takes images from some listings but not others, and I CANNOT figure it out. It's either an XPath issue or a timing issue. Please help!
File is here:
[https://drive.google.com/file/d/1N8w223JT1F6Gzr1g9QRIexJ\_Nn-HLs\_g/view?usp=sharing](https://drive.google.com/file/d/1N8w223JT1F6Gzr1g9QRIexJ_Nn-HLs_g/view?usp=sharing)
jelnog
webscraping
botcra
t3_jelnog
https://www.reddit.com/r/webscraping/comments/jelnog/scraping_images_with_octoparse/
10/20/2020 8:48:17 AM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scraping images with Octoparse
False
1
jelnog
0
1
5
5
0
0
2
2.85714285714286
0
0
39
55.7142857142857
70
128, 128, 128
3
Solid
50
No
217
Commented
10/21/2020 12:47:59 AM
Can you introduce an artificial wait delay into your script to rule out any possibility of a race condition?
g9htlxc
webscraping
matty_fu
t1_g9htlxc
https://www.reddit.com/r/webscraping/comments/jelnog/scraping_images_with_octoparse/g9htlxc/
10/21/2020 12:47:59 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jelnog
t3_jelnog
jelnog
0
jelnog
False
False
False
0
1
5
5
0
0
1
5.26315789473684
0
0
8
42.1052631578947
19
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
216
Commented
6/19/2021 7:03:00 AM
There are generally a number of tools available for this purpose. Most are quite expensive, so it is worth seeking out open-source options, and among them Octoparse is the best tool. Octoparse is an excellent online scraping tool, particularly for extracting data from Google Maps.
If you have it on your PC or laptop, it makes extracting large chunks of data nice and easy. Check out [www.octoparse.com](https://www.octoparse.com) for more features and pricing.
To extract data from Google Maps, follow these steps:
* Build a new task with the Advanced Mode by clicking the “+” sign
* Input the URL into the box
* Hit “Save URL” to proceed.
Head over to Octoparse’s tutorial for a detailed, step-by-step view of the process: [https://www.octoparse.com/tutorial-7/scrape-data-in-google-maps](https://www.octoparse.com/tutorial-7/scrape-data-in-google-maps)
h2aunne
learnprogramming
Interesting-Winforms
t1_h2aunne
https://www.reddit.com/r/learnprogramming/comments/nezi7k/help_with_recipe_for_google_maps_using_octoparse/h2aunne/
6/19/2021 7:03:00 AM
6/21/2021 6:45:31 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
nezi7k
t3_nezi7k
nezi7k
1
nezi7k
False
False
False
0
4
2
2
6
3.7037037037037
1
0.617283950617284
0
0
78
48.1481481481481
162
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
215
RepliedTo
6/23/2021 1:41:38 AM
This tutorial doesn't work, and support didn't help either. They are busy trying to sell their premium plans, and their latest version is crap, with countless issues.
h2pxvk7
learnprogramming
kartikoli
t1_h2pxvk7
https://www.reddit.com/r/learnprogramming/comments/nezi7k/help_with_recipe_for_google_maps_using_octoparse/h2pxvk7/
6/23/2021 1:41:38 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h2aunne
t1_h2aunne
h2aunne
0
nezi7k
True
False
False
1
4
2
2
2
7.14285714285714
2
7.14285714285714
0
0
10
35.7142857142857
28
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
214
Commented
6/19/2021 7:03:00 AM
There are generally a number of tools available for this purpose. Most are quite expensive, so it is worth seeking out open-source options, and among them Octoparse is the best tool. Octoparse is an excellent online scraping tool, particularly for extracting data from Google Maps.
If you have it on your PC or laptop, it makes extracting large chunks of data nice and easy. Check out [www.octoparse.com](https://www.octoparse.com) for more features and pricing.
To extract data from Google Maps, follow these steps:
* Build a new task with the Advanced Mode by clicking the “+” sign
* Input the URL into the box
* Hit “Save URL” to proceed.
Head over to Octoparse’s tutorial for a detailed, step-by-step view of the process: [https://www.octoparse.com/tutorial-7/scrape-data-in-google-maps](https://www.octoparse.com/tutorial-7/scrape-data-in-google-maps)
h2aunne
learnprogramming
Interesting-Winforms
t1_h2aunne
https://www.reddit.com/r/learnprogramming/comments/nezi7k/help_with_recipe_for_google_maps_using_octoparse/h2aunne/
6/19/2021 7:03:00 AM
6/21/2021 6:45:31 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
nezi7k
t3_nezi7k
nezi7k
1
nezi7k
False
False
False
0
4
2
2
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
213
RepliedTo
6/23/2021 1:41:38 AM
This tutorial doesn't work, and support didn't help either. They are busy trying to sell their premium plans, and their latest version is crap, with countless issues.
h2pxvk7
learnprogramming
kartikoli
t1_h2pxvk7
https://www.reddit.com/r/learnprogramming/comments/nezi7k/help_with_recipe_for_google_maps_using_octoparse/h2pxvk7/
6/23/2021 1:41:38 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h2aunne
t1_h2aunne
h2aunne
0
nezi7k
True
False
False
1
4
2
2
128, 128, 128
3
Solid
50
No
212
Posted
12/17/2021 2:22:48 PM
Hi, I am a master's student slowly teaching myself Python. I am trying to write a for loop to web-scrape a list of URLs that I have collected from this website: [https://tracker.gg/valorant/leaderboards/ranked/all/default?page=1&act=4cb622e1-4244-6da3-7276-8daaf1c01be2](https://tracker.gg/valorant/leaderboards/ranked/all/default?page=1&act=4cb622e1-4244-6da3-7276-8daaf1c01be2)
Attached below is my code for the first page of data I scraped. I am trying to build a for loop that scrapes all the stats from the individual players' links, such as average damage, headshot %, etc. However, I am not sure how, and any help would be appreciated. I am aware that programs like Octoparse exist, but I would like to learn the basics if possible.
    table = tracker_soup.find('table', {'class': 'trn-table'})
    players = []
    for row in table.find('tbody').find_all('tr'):
        rank, player, ignore, rating, tier, wins = row.find_all('td')
        data = {"player": player.find('span', {'class': 'trn-ign__username'}).get_text(),
                "rank": rank.get_text().strip(),
                "wins": wins.get_text().strip(),
                "rating": rating.get_text().strip().replace(',', ''),
                "link": player.find('a')['href']}
        players.append(data)
    df1 = pd.DataFrame(players)
    # For all the links I gathered, I prepended the string 'https://tracker.gg'
    df1['link'] = 'https://tracker.gg' + df1['link'].astype(str)
riihko
webscraping
childishlamino
t3_riihko
https://www.reddit.com/r/webscraping/comments/riihko/how_to_web_scrape_from_a_list_of_url_that_i_have/
12/17/2021 2:22:48 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to web scrape from a list of URL that I have collected
False
1
riihko
0
1
10
10
5
2.23214285714286
3
1.33928571428571
0
0
138
61.6071428571429
224
128, 128, 128
3
Solid
50
Yes
211
Commented
12/17/2021 2:48:20 PM
Once you have all the links in a df you can do something like this:
    import requests
    import pandas as pd
    from bs4 import BeautifulSoup

    player_list_from_df = df1['link'].tolist()
    stats_df = []
    for player in player_list_from_df:
        link = 'https://tracker.gg' + player
        print(link)
        data = requests.get(link)
        print(data)
        new_soup = BeautifulSoup(data.text, 'html.parser')
        # TODO: get stats data into new_df, you can do this part!
        stats_df.append(new_df)
    final_stats_df = pd.concat(stats_df)
    final_stats_df.to_csv('outputfile.csv', index=False)
hox8yt7
webscraping
bushcat69
t1_hox8yt7
https://www.reddit.com/r/webscraping/comments/riihko/how_to_web_scrape_from_a_list_of_url_that_i_have/hox8yt7/
12/17/2021 2:48:20 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
riihko
t3_riihko
riihko
1
riihko
False
False
False
0
1
10
10
0
0
1
1.49253731343284
0
0
45
67.1641791044776
67
128, 128, 128
3
Solid
50
Yes
210
RepliedTo
12/17/2021 2:51:12 PM
this was what I was looking for thank you!
hox9due
webscraping
childishlamino
t1_hox9due
https://www.reddit.com/r/webscraping/comments/riihko/how_to_web_scrape_from_a_list_of_url_that_i_have/hox9due/
12/17/2021 2:51:12 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hox8yt7
t1_hox8yt7
hox8yt7
0
riihko
True
False
False
1
1
10
10
1
11.1111111111111
0
0
0
0
1
11.1111111111111
9
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
209
Commented
12/17/2021 2:41:43 PM
Once you get the list of links, I suggest you look into the tracker.gg API. It will greatly simplify your work 😉
hox80x4
webscraping
SexiestBoomer
t1_hox80x4
https://www.reddit.com/r/webscraping/comments/riihko/how_to_web_scrape_from_a_list_of_url_that_i_have/hox80x4/
12/17/2021 2:41:43 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
riihko
t3_riihko
riihko
1
riihko
False
False
False
0
2
10
10
2
9.09090909090909
0
0
0
0
9
40.9090909090909
22
128, 128, 128
3
Solid
50
Yes
208
RepliedTo
12/17/2021 2:43:25 PM
I just need to create an account to access it right?
hox89ky
webscraping
childishlamino
t1_hox89ky
https://www.reddit.com/r/webscraping/comments/riihko/how_to_web_scrape_from_a_list_of_url_that_i_have/hox89ky/
12/17/2021 2:43:25 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hox80x4
t1_hox80x4
hox80x4
1
riihko
True
False
False
1
1
10
10
1
9.09090909090909
0
0
0
0
4
36.3636363636364
11
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
207
RepliedTo
12/17/2021 3:43:49 PM
Seems so
Edit: if not the case, reply to this and i will guide you for the scraping
hoxh6ez
webscraping
SexiestBoomer
t1_hoxh6ez
https://www.reddit.com/r/webscraping/comments/riihko/how_to_web_scrape_from_a_list_of_url_that_i_have/hoxh6ez/
12/17/2021 3:43:49 PM
12/17/2021 3:46:52 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hox89ky
t1_hox89ky
hox89ky
0
riihko
False
False
False
2
2
10
10
0
0
0
0
0
0
6
33.3333333333333
18
128, 128, 128
3
Solid
50
No
206
Posted
4/24/2023 6:35:21 AM
The practice of retrieving a webpage and extracting its data is referred to as web scraping. After acquiring the information, it is common to analyze, reformat, parse, or transfer it to a spreadsheet. There are various web scraping applications, but we will concentrate on a few for now, such as collecting price and product data from marketplaces. Retailers utilize this information to enhance their understanding of the market and competition.
With millions of products available on the Amazon Marketplace, it can be challenging to keep up with the latest trends and insights to improve your store’s performance. Fortunately, web scraping can be useful for extracting valuable data from Amazon’s website, including product information, pricing, customer reviews, and more.
### Why should you use web scraping?
Web scraping can help you streamline your product research process and save time that would otherwise be spent manually searching through the website. This can help you stay ahead of the competition and improve your store’s efficiency and profitability. [Web scraping](https://sstechnologyglobal.com/web-data-extraction/) can assist in increasing your business growth tenfold using web data, regardless of whether you’re a new or expanding enterprise.
**Here is how web scraping can help you improve your business:**
### Technology makes it easy:
Everyone has access to the latest technology, and there is no reason for you to hold back on utilizing every resource you can get your hands on. Access to technology is paramount when it comes to web scraping since it allows virtually anyone to scrape large amounts of data efficiently. Numerous resources are available online to help you become proficient in web scraping, and you can leverage various service providers who can assist with data collection.
### Innovation:
The potential uses of web scraping are boundless. By providing easy access to web data for everyone, web scraping sets a higher standard for innovation. It compels businesses to enhance their value proposition. Web scraping enables businesses to test and implement new ideas more swiftly, promoting faster innovation.
### Marketing automation:
Web scraping can aid in marketing automation by automating lead generation, competitor analysis, market research, and content creation. Extracting data from websites helps businesses acquire the necessary information for effective marketing campaigns and decision-making.
### Brand monitoring:
The market for brand monitoring is rapidly expanding. In today’s world, checking customer reviews before buying online has become fundamental, as consumers prefer recommendations and reassurance when making purchasing decisions.
## Top 10 Amazon Scraping Tools for 2023
**Octoparse**:
Octoparse is a web scraping tool that is free for life, enabling users to extract web data without any coding knowledge quickly. The tool stands out with its intuitive, graphic UI design and auto-detection function, eliminating the need to search for data manually. While the free plan has a 10,000-row limit, paid plans offer cloud service, scheduled automatic scraping, and IP rotation, making them useful for monitoring stock numbers, prices, and other information on a regular basis.
**SS Technology**:
SS Technology is a leading [Amazon scraping](https://sstechnologyglobal.com/amazon-data-scraping/) service provider that utilizes modern technology to scrape Amazon data and help businesses make informed decisions regarding their marketing. They use various tools, scripts, and software that are crucial in the process of web data extraction, particularly when dealing with enterprise-level data.
**ScrapeStorm**:
ScrapeStorm is a visual web scraping tool that uses AI to detect data, similar to Octoparse’s auto-detection. Its browser-like UI makes it easy to use, and its Pre Login function can scrape URLs that require login. A free quota of 100 rows and one concurrent run is available, but upgrading to the professional plan with 10,000 rows per day is recommended for more data.
**Parsehub**:
ParseHub, a downloadable web scraper, is another free tool that enables users to build custom crawlers and export data into structured spreadsheets. Although it doesn’t support auto-detection or offer Amazon templates, experienced users can still utilize it for Amazon scraping.
**Instant Data Scraper**:
The majority of the Amazon scraping tools that have been mentioned have additional and advanced functionalities apart from scraping. While they can be beneficial, they can also make the platform more complicated. For those who only require data extraction from a webpage, **Instant Data Scraper** is a simpler solution.
**Data Miner**:
Data Miner is a browser extension that can be used with Google Chrome and Microsoft Edge. It enables the extraction of data from web pages and exporting it to a CSV or Excel file. There are several custom recipes that can be used to scrape Amazon data with ease. With a user-friendly interface and basic functions, it is suitable for small businesses or casual users.
**ScraperAPI**:
This tool can assist you in scraping data from Amazon, and it works across a variety of operating systems; its AI-based web scraping approach eliminates the need to specify the desired data. Developed by expert developers, these tools are reliable and efficient.
**Webscraper.io**:
Webscraper.io is a developer tool that offers a point-and-click interface. Unlike other scrapers, it doesn’t have specific templates for Amazon or e-commerce scraping, so users must build their own crawler by selecting the information they need.
**Scraper Parsers**:
Scraper Parsers is a browser extension that can extract unstructured data and visualize it without requiring any coding. Once the data is extracted, it can be viewed on the site or downloaded in various formats such as XLSX, XLS, XML, and CSV. Moreover, the tool can create charts and display the extracted numbers in an easy-to-understand format.
**Apify**:
Apify is a powerful Amazon scraper that surpasses the limitations of the official Amazon API. It allows you to obtain price offers for a specific Amazon standard ID, search for a specific keyword, and target a specific country. With **Apify**, you can easily obtain all the necessary data you need from Amazon.
By leveraging web scraping techniques, you can obtain detailed insights about your competitors, identify popular products, track pricing trends, and gather valuable customer feedback. These insights can help you make informed decisions about your inventory, pricing strategies, and marketing campaigns, giving you a competitive edge in the marketplace.
12x63ce
u_Sure-Series8740
Sure-Series8740
t3_12x63ce
https://www.reddit.com/r/u_Sure-Series8740/comments/12x63ce/top_10_amazon_scraping_tools_2023/
4/24/2023 6:35:21 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Top 10 Amazon Scraping Tools 2023
False
1
12x63ce
0
1
1
1
55
5.32945736434108
5
0.484496124031008
0
0
575
55.7170542635659
1032
128, 128, 128
3
Solid
50
No
204
Commented
7/25/2022 4:16:46 AM
I'll do it for $1,000. PM me.
ihjdafs
Automate
TeleworkSolutions
t1_ihjdafs
https://www.reddit.com/r/Automate/comments/w6ystd/is_there_a_better_way_to_automate_the_scraping_of/ihjdafs/
7/25/2022 4:16:46 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
w6ystd
t3_w6ystd
w6ystd
0
w6ystd
False
False
False
0
1
34
34
0
0
0
0
0
0
2
25
8
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
203
Commented
7/24/2022 6:00:06 PM
If you can’t code what kind of solution do you expect? At a minimum, you have to at least be able to utilize a low code solution like UiPath, Automation Anywhere, Blue Prism. Even using a low code tool you still need to understand HOW to automate a web application to reliably get the data that you’re after. It’s doubtful anyone here can help you if you don’t have the minimum required to be able to even learn.
To answer your question, browser extensions are useless. First you want to analyze the HTML of the website to see how the elements are structured, so you know what data to get. Then you want to look at the networking tab of dev tools and keep an eye out for API calls, to see if you could just send a request to the server. From there you’re able to start to automate, you just have to select the best tool for the job. In python you might use selenium, or the requests library. If you’re able to just make an API call and on mac you could use curl. Otherwise you might use one of the RPA tools I mentioned earlier.
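The "analyze the HTML first" step above can be made concrete with a small stdlib-only Python sketch. The `listing-title` class name here is a made-up placeholder for whatever selector you find when you inspect the real page; with `requests` you would feed the fetched page text into the same parser:

```python
from html.parser import HTMLParser

class ListingParser(HTMLParser):
    """Collect the text of every element carrying class="listing-title".

    The class name is hypothetical -- substitute whatever you see in
    the target page's HTML after inspecting it with dev tools.
    """
    def __init__(self):
        super().__init__()
        self._capture = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for this element
        if ("class", "listing-title") in attrs:
            self._capture = True

    def handle_data(self, data):
        if self._capture:
            self.titles.append(data.strip())
            self._capture = False

# Stand-in for the HTML a real request would return
sample = '<div><h2 class="listing-title">Red bike</h2><h2 class="listing-title">Blue bike</h2></div>'
parser = ListingParser()
parser.feed(sample)
```

In practice you would still check the network tab first; if the site exposes a JSON API, parsing HTML like this becomes unnecessary.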
ihh3ye0
Automate
Stormkrieg
t1_ihh3ye0
https://www.reddit.com/r/Automate/comments/w6ystd/is_there_a_better_way_to_automate_the_scraping_of/ihh3ye0/
7/24/2022 6:00:06 PM
1/1/0001 12:00:00 AM
False
False
6
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
w6ystd
t3_w6ystd
w6ystd
1
w6ystd
False
False
False
0
2
34
34
2
0.975609756097561
2
0.975609756097561
0
0
83
40.4878048780488
205
128, 128, 128
3
Solid
50
Yes
202
RepliedTo
7/24/2022 6:01:03 PM
Well I was hoping someone here could write me a script that would do it for me.
ihh43fc
Automate
LordCrumpets
t1_ihh43fc
https://www.reddit.com/r/Automate/comments/w6ystd/is_there_a_better_way_to_automate_the_scraping_of/ihh43fc/
7/24/2022 6:01:03 PM
1/1/0001 12:00:00 AM
False
False
-7
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ihh3ye0
t1_ihh3ye0
ihh3ye0
2
w6ystd
True
True
comment score below threshold
False
1
1
34
34
1
5.88235294117647
0
0
0
0
4
23.5294117647059
17
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
201
RepliedTo
7/24/2022 6:05:08 PM
No one is going to write this for you for free. If you’re hiring, post it on a forhire sub or on Fiverr. Automation projects are all custom, based on very specific requirements; there’s usually not an “out-of-the-box” solution waiting.
ihh4on4
Automate
Stormkrieg
t1_ihh4on4
https://www.reddit.com/r/Automate/comments/w6ystd/is_there_a_better_way_to_automate_the_scraping_of/ihh4on4/
7/24/2022 6:05:08 PM
1/1/0001 12:00:00 AM
False
False
5
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ihh43fc
t1_ihh43fc
ihh43fc
0
w6ystd
False
False
False
2
2
34
34
1
2.22222222222222
0
0
0
0
19
42.2222222222222
45
128, 128, 128
3
Solid
50
Yes
200
RepliedTo
7/25/2022 1:01:15 AM
Sure, for two grand.
ihipxqz
Automate
Geminii27
t1_ihipxqz
https://www.reddit.com/r/Automate/comments/w6ystd/is_there_a_better_way_to_automate_the_scraping_of/ihipxqz/
7/25/2022 1:01:15 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ihh43fc
t1_ihh43fc
ihh43fc
1
w6ystd
False
False
False
2
1
34
34
1
25
0
0
0
0
2
50
4
128, 128, 128
3
Solid
50
Yes
199
RepliedTo
7/25/2022 6:54:50 AM
I’m willing to pay. Not sure about two grand though.
ihjrgj1
Automate
LordCrumpets
t1_ihjrgj1
https://www.reddit.com/r/Automate/comments/w6ystd/is_there_a_better_way_to_automate_the_scraping_of/ihjrgj1/
7/25/2022 6:54:50 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ihipxqz
t1_ihipxqz
ihipxqz
0
w6ystd
True
False
False
3
1
34
34
2
18.1818181818182
0
0
0
0
3
27.2727272727273
11
128, 128, 128
3
Solid
50
No
205
Posted
7/24/2022 3:55:35 PM
So I need something that can automatically maybe once a day go to a website (let's just say it lists something for sale, like bikes), scrape the available ads, then notify me if/when there are new ones.
What's the best way to do that? I've tried lots of browser extensions and a program called Octoparse but nothing works exactly how I want it. I can't code so no idea how to go that route, but I imagine it would work best with some sort of coded script rather than an app or program?
Any ideas?
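The core of a "notify me about new ads" script is just diffing today's listing IDs against a saved set. A minimal sketch, assuming some other code has already scraped the IDs off the page (the state file name is arbitrary):

```python
import json
from pathlib import Path

def new_listings(current_ids, state_file="seen_listings.json"):
    """Return only the listing IDs not seen on a previous run,
    and update the on-disk record of everything seen so far.

    `current_ids` would come from whatever scraper fetches the ads
    page once a day (cron, Task Scheduler, etc.).
    """
    path = Path(state_file)
    seen = set(json.loads(path.read_text())) if path.exists() else set()
    fresh = sorted(set(current_ids) - seen)
    # Persist the union so tomorrow's run knows about today's ads too
    path.write_text(json.dumps(sorted(seen | set(current_ids))))
    return fresh
```

Anything returned by `new_listings` can then be emailed or pushed to a messaging app as the notification.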
w6ystd
Automate
LordCrumpets
t3_w6ystd
https://www.reddit.com/r/Automate/comments/w6ystd/is_there_a_better_way_to_automate_the_scraping_of/
7/24/2022 3:55:35 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Is there a better way to automate the scraping of website data and alerting me to new entries?
False
0.47
w6ystd
0
1
34
34
5
5.20833333333333
0
0
0
0
38
39.5833333333333
96
128, 128, 128
3
Solid
50
No
198
Posted
12/30/2022 3:23:42 PM
Hi. I am new to programming and scripting. I am learning now, but I could really use some help with a project, if somebody would be kind enough to help me out. When using Octoparse, extracting data that is visible on the page is pretty straightforward. But when you want to extract data that is hidden, you need to know a bit more scripting, I guess.
I would like to extract the SKU number from this site [https://www.nespresso.com/se/se/order/capsules/vertuo](https://www.nespresso.com/se/se/order/capsules/vertuo) but I don't know how.
The SKU number is not visible on the website but is in the background, e.g. (Nb-sku-coffee id="109877", where id is the product number). The SKU number is the ID of every product.
Could someone help me and explain how to extract this data that is not on the front page?
Best regards
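Given the `Nb-sku-coffee id="..."` snippet quoted in the question, the hidden IDs can be pulled straight out of the page source with Python's stdlib parser. The tag name is taken from the question; adjust it to whatever the real page markup shows:

```python
from html.parser import HTMLParser

class SkuParser(HTMLParser):
    """Pull the `id` attribute off every <nb-sku-coffee> element.

    HTMLParser lowercases tag names, so "Nb-sku-coffee" in the page
    source matches "nb-sku-coffee" here.
    """
    def __init__(self):
        super().__init__()
        self.skus = []

    def handle_starttag(self, tag, attrs):
        if tag == "nb-sku-coffee":
            for name, value in attrs:
                if name == "id":
                    self.skus.append(value)

# Stand-in for the page source Octoparse (or requests) would hand over
html = '<div><nb-sku-coffee id="109877"></nb-sku-coffee></div>'
p = SkuParser()
p.feed(html)
```

Feeding the full exported HTML through `p.feed(...)` would collect every product's SKU, whether or not it is rendered visibly.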
zz33l7
webscraping
Snellfarfar
t3_zz33l7
https://www.reddit.com/r/webscraping/comments/zz33l7/how_to_scrape_data_that_is_not_visable_on_the/
12/30/2022 3:23:42 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to scrape data that is not visable on the site with Octoparse
False
1
zz33l7
0
1
16
16
4
2.5
0
0
0
0
65
40.625
160
128, 128, 128
3
Solid
50
Yes
197
Commented
12/30/2022 9:44:33 PM
Hi, considering that you have the whole HTML data from Octoparse, you can apply the following code to get the SKU. All you need to do is change the req.text.
https://pastebin.com/2vc8JzKz
j2ar2dr
webscraping
Elitedoorhugger
t1_j2ar2dr
https://www.reddit.com/r/webscraping/comments/zz33l7/how_to_scrape_data_that_is_not_visable_on_the/j2ar2dr/
12/30/2022 9:44:33 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
zz33l7
t3_zz33l7
zz33l7
1
zz33l7
False
False
False
0
1
16
16
0
0
0
0
0
0
14
45.1612903225806
31
128, 128, 128
3
Solid
50
Yes
196
RepliedTo
1/8/2023 4:09:22 PM
Thank you. But could not this be done by using xpath?
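Yes, XPath works for this too, as long as the fragment parses. The stdlib `ElementTree` supports a limited XPath subset over well-formed markup; for messy real-world HTML you would reach for lxml instead. A sketch using the tag name from the earlier question:

```python
import xml.etree.ElementTree as ET

# Stand-in for the relevant fragment of the page source; real pages
# are rarely this clean, which is where lxml.html earns its keep.
fragment = '<root><nb-sku-coffee id="109877"/><nb-sku-coffee id="109878"/></root>'
tree = ET.fromstring(fragment)

# ".//nb-sku-coffee" is the ElementTree XPath for "any descendant
# element with this tag"; .get("id") reads the hidden SKU attribute.
skus = [el.get("id") for el in tree.findall(".//nb-sku-coffee")]
```

In Octoparse itself, the equivalent would be pointing a custom XPath at the element and extracting its `id` attribute rather than its text.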
j3hbmhi
webscraping
Snellfarfar
t1_j3hbmhi
https://www.reddit.com/r/webscraping/comments/zz33l7/how_to_scrape_data_that_is_not_visable_on_the/j3hbmhi/
1/8/2023 4:09:22 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j2ar2dr
t1_j2ar2dr
j2ar2dr
0
zz33l7
True
False
False
1
1
16
16
1
9.09090909090909
0
0
0
0
3
27.2727272727273
11
128, 128, 128
3
Solid
50
Yes
193
Commented
2/19/2020 5:47:41 PM
You have to scrape it. It's not offered via any API.
I've got everything except aggregating it by dates. Had someone ask for this exact thing last week, so we'll see whether it's worth implementing.
fi45nam
FulfillmentByAmazon
oldschoolvalue
t1_fi45nam
https://www.reddit.com/r/FulfillmentByAmazon/comments/f5cy2g/data_mining_your_item_reviews_daily_what_is_the/fi45nam/
2/19/2020 5:47:41 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
f5cy2g
t3_f5cy2g
f5cy2g
1
f5cy2g
False
False
False
0
1
20
20
1
2.94117647058824
0
0
0
0
15
44.1176470588235
34
128, 128, 128
3
Solid
50
Yes
192
RepliedTo
2/19/2020 5:58:35 PM
Yeah, I'm gonna do this with Octoparse I guess, or Simple Scraper.
Any scrapers that you recommend?
fi46sfo
FulfillmentByAmazon
rawrtherapy
t1_fi46sfo
https://www.reddit.com/r/FulfillmentByAmazon/comments/f5cy2g/data_mining_your_item_reviews_daily_what_is_the/fi46sfo/
2/19/2020 5:58:35 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fi45nam
t1_fi45nam
fi45nam
0
f5cy2g
True
False
False
1
1
20
20
2
10.5263157894737
0
0
0
0
7
36.8421052631579
19
128, 128, 128
3
Solid
50
No
191
Commented
2/17/2020 9:15:13 PM
Performance > Brand Dashboard > Customer Reviews and then choose time period? Record them daily and it will always only be reviews from the previous day.
fhye64a
FulfillmentByAmazon
jordanwilson23
t1_fhye64a
https://www.reddit.com/r/FulfillmentByAmazon/comments/f5cy2g/data_mining_your_item_reviews_daily_what_is_the/fhye64a/
2/17/2020 9:15:13 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
f5cy2g
t3_f5cy2g
f5cy2g
0
f5cy2g
False
False
False
0
1
20
20
0
0
0
0
0
0
14
58.3333333333333
24
128, 128, 128
3
Solid
50
Yes
190
Commented
2/17/2020 7:46:30 PM
The reason this tool doesn't exist is because there is no way of getting that data within TOS.
fhy55el
FulfillmentByAmazon
resoluter08
t1_fhy55el
https://www.reddit.com/r/FulfillmentByAmazon/comments/f5cy2g/data_mining_your_item_reviews_daily_what_is_the/fhy55el/
2/17/2020 7:46:30 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
f5cy2g
t3_f5cy2g
f5cy2g
1
f5cy2g
False
False
False
0
1
20
20
0
0
0
0
0
0
8
44.4444444444444
18
128, 128, 128
3
Solid
50
Yes
189
RepliedTo
2/17/2020 8:03:09 PM
So I can't data mine it?
I mean, it's public information.
Why would it be against TOS? That makes no sense.
fhy6t3i
FulfillmentByAmazon
rawrtherapy
t1_fhy6t3i
https://www.reddit.com/r/FulfillmentByAmazon/comments/f5cy2g/data_mining_your_item_reviews_daily_what_is_the/fhy6t3i/
2/17/2020 8:03:09 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fhy55el
t1_fhy55el
fhy55el
0
f5cy2g
True
False
False
1
1
20
20
0
0
0
0
0
0
10
47.6190476190476
21
128, 128, 128
3
Solid
50
Yes
188
Commented
2/17/2020 6:44:16 PM
You could set up a scraper to pull the data and push it into a database. We set up something like this.
Interested in getting a tool built?
fhxyqxj
FulfillmentByAmazon
vrjain
t1_fhxyqxj
https://www.reddit.com/r/FulfillmentByAmazon/comments/f5cy2g/data_mining_your_item_reviews_daily_what_is_the/fhxyqxj/
2/17/2020 6:44:16 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
f5cy2g
t3_f5cy2g
f5cy2g
1
f5cy2g
False
False
False
0
1
20
20
0
0
0
0
0
0
12
48
25
128, 128, 128
3
Solid
50
Yes
187
RepliedTo
2/17/2020 6:47:44 PM
Yeah, actually I would be interested.
fhxz4pb
FulfillmentByAmazon
rawrtherapy
t1_fhxz4pb
https://www.reddit.com/r/FulfillmentByAmazon/comments/f5cy2g/data_mining_your_item_reviews_daily_what_is_the/fhxz4pb/
2/17/2020 6:47:44 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fhxyqxj
t1_fhxyqxj
fhxyqxj
0
f5cy2g
True
False
False
1
1
20
20
0
0
0
0
0
0
3
50
6
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
195
Posted
2/17/2020 6:16:43 PM
I would really like to have software data mine my items so that I can track how many reviews are left per day, so I can compare week by week, month by month, etc.
Any way to get this done?
I would like to just get my day-to-day review numbers and compare them with other dates.
Any reputable software that can do this?
If no software is available, can I just use Octoparse or SimpleScraper to data mine?
f5cy2g
FulfillmentByAmazon
rawrtherapy
t3_f5cy2g
https://www.reddit.com/r/FulfillmentByAmazon/comments/f5cy2g/data_mining_your_item_reviews_daily_what_is_the/
2/17/2020 6:16:43 PM
1/1/0001 12:00:00 AM
False
False
2
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Data Mining your item reviews daily, what is the best way to do this?
False
0.76
f5cy2g
0
4
20
20
2
2.5
0
0
0
0
30
37.5
80
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
194
Posted
1/13/2020 10:29:01 PM
For some reason on Octoparse, when I try scraping data from the Shipping Queue,
I add my URL and it keeps asking me to sign in.
Any of you guys know how to get around this so I don't have to sign in and it will just collect the data from that page?
I already asked Octoparse and am waiting on their email; wanted to ask here to see if anyone had any insight on this.
eobkmj
FulfillmentByAmazon
rawrtherapy
t3_eobkmj
https://www.reddit.com/r/FulfillmentByAmazon/comments/eobkmj/any_of_you_guys_use_octoparse_to_scrape_data_from/
1/13/2020 10:29:01 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Any of you guys use Octoparse to scrape data from Amazon? Need some help
False
1
eobkmj
0
4
20
20
0
0
0
0
0
0
29
39.1891891891892
74
128, 128, 128
3
Solid
50
No
186
Posted
9/28/2020 7:06:50 PM
I tried using Selenium at first, but it's hard because the phone, the email, etc. have the same class name and, in fact, every attribute value is the same.
I tried a tool like Octoparse, but it turns out it just wants me to pay.
And I'd be forever ashamed of myself if I paid for basically a Selenium bot.
I want to keep using Selenium because I get to paginate and it gives me all the flexibility I want.
How would you do it?
j1ihwf
learnpython
Rahul_Desai1999
t3_j1ihwf
https://www.reddit.com/r/learnpython/comments/j1ihwf/how_would_you_scrape_google_maps/
9/28/2020 7:06:50 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How would you scrape google maps?
False
0.4
j1ihwf
0
1
21
21
1
1.23456790123457
2
2.46913580246914
0
0
29
35.8024691358025
81
128, 128, 128
3
Solid
50
No
185
Commented
9/28/2020 7:13:24 PM
Have you looked at the API? Probably a better tool than Selenium.
g6zfcn8
learnpython
eccepiscinam
t1_g6zfcn8
https://www.reddit.com/r/learnpython/comments/j1ihwf/how_would_you_scrape_google_maps/g6zfcn8/
9/28/2020 7:13:24 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j1ihwf
t3_j1ihwf
j1ihwf
1
j1ihwf
False
False
False
0
1
21
21
1
8.33333333333333
0
0
0
0
5
41.6666666666667
12
128, 128, 128
3
Solid
50
Yes
184
RepliedTo
9/28/2020 7:16:16 PM
Yes, but very pricey; well worth it for web apps.
g6zfqvm
learnpython
Omar_88
t1_g6zfqvm
https://www.reddit.com/r/learnpython/comments/j1ihwf/how_would_you_scrape_google_maps/g6zfqvm/
9/28/2020 7:16:16 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
g6zfcn8
t1_g6zfcn8
g6zfcn8
2
j1ihwf
False
False
False
1
1
21
21
2
20
1
10
0
0
4
40
10
128, 128, 128
3
Solid
50
Yes
183
RepliedTo
9/28/2020 7:19:19 PM
didn't realize Google charges for APIs
g6zg6cj
learnpython
eccepiscinam
t1_g6zg6cj
https://www.reddit.com/r/learnpython/comments/j1ihwf/how_would_you_scrape_google_maps/g6zg6cj/
9/28/2020 7:19:19 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
g6zfqvm
t1_g6zfqvm
g6zfqvm
1
j1ihwf
False
False
False
2
1
21
21
0
0
0
0
0
0
4
66.6666666666667
6
128, 128, 128
3
Solid
50
No
182
RepliedTo
9/28/2020 7:25:58 PM
They don't charge unless you go over the credit limit they provide. Why don't you scrape OpenStreetMap? Actually download the OSM file of your country and scrape it.
I'm working on grabbing all the parks in the metro area for my Pokémon GO Telegram bot for the local community.
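The OSM-file route amounts to walking the XML and filtering on tags. A toy version of the parks example, using only the stdlib (real country extracts are huge, so you would stream with `ET.iterparse` or use a dedicated library like pyosmium instead of `fromstring`):

```python
import xml.etree.ElementTree as ET

def park_names(osm_xml):
    """Return the names of elements tagged leisure=park in an OSM
    XML extract. OSM stores metadata as child <tag k="..." v="..."/>
    elements on each node/way/relation.
    """
    root = ET.fromstring(osm_xml)
    names = []
    for el in root:
        tags = {t.get("k"): t.get("v") for t in el.findall("tag")}
        if tags.get("leisure") == "park" and "name" in tags:
            names.append(tags["name"])
    return names

# Tiny hand-written extract standing in for a downloaded .osm file
sample = (
    '<osm>'
    '<way id="1"><tag k="leisure" v="park"/><tag k="name" v="Central Park"/></way>'
    '<way id="2"><tag k="highway" v="residential"/></way>'
    '</osm>'
)
```

Unlike the Google Maps API, the data file is free to download and re-query as often as you like.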
g6zh4vf
learnpython
lolslim
t1_g6zh4vf
https://www.reddit.com/r/learnpython/comments/j1ihwf/how_would_you_scrape_google_maps/g6zh4vf/
9/28/2020 7:25:58 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
g6zfqvm
t1_g6zfqvm
g6zfqvm
0
j1ihwf
False
False
False
2
1
21
21
0
0
1
2.08333333333333
0
0
27
56.25
48
128, 128, 128
3
Solid
50
No
181
RepliedTo
9/28/2020 7:51:17 PM
Depends on the API and how much you use it.
g6zkr1h
learnpython
kramrm
t1_g6zkr1h
https://www.reddit.com/r/learnpython/comments/j1ihwf/how_would_you_scrape_google_maps/g6zkr1h/
9/28/2020 7:51:17 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
g6zg6cj
t1_g6zg6cj
g6zg6cj
0
j1ihwf
False
False
False
3
1
21
21
0
0
0
0
0
0
4
40
10
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
179
Commented
12/5/2022 6:07:52 PM
Is this effective for B2B outreach? I have a couple of friends with small businesses that could use some help.
iz11wsp
sales
OutlandishnessOk153
t1_iz11wsp
https://www.reddit.com/r/sales/comments/zcvmna/how_i_find_leads_my_process_tools/iz11wsp/
12/5/2022 6:07:52 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
zcvmna
t3_zcvmna
zcvmna
1
zcvmna
False
False
False
0
5
22
22
1
5
0
0
0
0
8
40
20
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
178
RepliedTo
12/5/2022 6:17:40 PM
Yeah this is actually ideally meant for B2B - that's what I primarily used these tools for.
Lex Leads has a good feature where you can sort by type of company, location of company including state/country/zip code, and number of employees. That's something that Outscraper and Octoparse don't do.
Like for example, let's say your friend is looking to sell to pizzerias that have 1-10 employees. Lex Leads gives you an option to specify that.
Or maybe your friend is in biotech and he's looking to get Biotech leads. Same process just a different industry.
They will basically give you a downloadable excel spreadsheet but you also can see the leads in the Lex Leads app as well.
Zoominfo is still the leader in this space but then again the cost is huge. Lex Leads suite charges 99 USD a month for unlimited leads but if you're a startup that is not bad.
But yeah for long term lead generation I'd opt for Lex Leads as opposed to Outscraper because of the flat cost. Hope that helps.
iz13ffv
sales
Same_Paint6431
t1_iz13ffv
https://www.reddit.com/r/sales/comments/zcvmna/how_i_find_leads_my_process_tools/iz13ffv/
12/5/2022 6:17:40 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iz11wsp
t1_iz11wsp
iz11wsp
1
zcvmna
True
False
False
1
4
22
22
14
7.82122905027933
1
0.558659217877095
0
0
80
44.6927374301676
179
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
177
RepliedTo
12/5/2022 6:26:21 PM
Are your primary channels for outreach LinkedIn? I'm currently in sales B2B and wondering how to up my game. Utilizing email, phone, and LinkedIn. Need to automate this honestly.
iz14s01
sales
OutlandishnessOk153
t1_iz14s01
https://www.reddit.com/r/sales/comments/zcvmna/how_i_find_leads_my_process_tools/iz14s01/
12/5/2022 6:26:21 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iz13ffv
t1_iz13ffv
iz13ffv
1
zcvmna
False
False
False
2
5
22
22
0
0
0
0
0
0
16
55.1724137931034
29
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
176
RepliedTo
12/5/2022 6:37:47 PM
Good question! Different industries work better with different approaches.
For example, cold calling lawyers is really difficult primarily because of the gatekeepers in that industry.
In that scenario I would opt for a LinkedIn approach with a brief canned message stating how your service can help their law firm and a request for a meeting.
Lex Leads will automate the message sending so you'll have hundreds of messages sent depending on how long you run the automation for.
Just beware if you let it run for too long you may get overwhelmed with the responses haha.
Then you pretty much get replies and direct the conversation to a meeting.
So in that scenario of selling to lawyers it's LinkedIn THEN phone contact.
But let's say you're selling to real estate agents - in that scenario I would resort to phone contact first.
Or to give another example let's say you're selling to doctors - this would be a cold call approach in my opinion.
Or maybe you're selling a VPN software suite to enterprises? You probably want to cold call first to get access to decision makers.
Hope that makes sense.
iz16jvk
sales
Same_Paint6431
t1_iz16jvk
https://www.reddit.com/r/sales/comments/zcvmna/how_i_find_leads_my_process_tools/iz16jvk/
12/5/2022 6:37:47 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iz14s01
t1_iz14s01
iz14s01
1
zcvmna
True
False
False
3
4
22
22
5
2.63157894736842
6
3.15789473684211
0
0
86
45.2631578947368
190
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
175
RepliedTo
12/5/2022 6:56:20 PM
Yes. I just DM'd asking for a private meeting. Please let me know if we could arrange this.
iz19f31
sales
OutlandishnessOk153
t1_iz19f31
https://www.reddit.com/r/sales/comments/zcvmna/how_i_find_leads_my_process_tools/iz19f31/
12/5/2022 6:56:20 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iz16jvk
t1_iz16jvk
iz16jvk
0
zcvmna
False
False
False
4
5
22
22
0
0
0
0
0
0
8
47.0588235294118
17
128, 128, 128
3
Solid
50
No
174
Commented
12/5/2022 3:58:29 PM
What does your company do?
iz0i3xx
sales
bfh956
t1_iz0i3xx
https://www.reddit.com/r/sales/comments/zcvmna/how_i_find_leads_my_process_tools/iz0i3xx/
12/5/2022 3:58:29 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
zcvmna
t3_zcvmna
zcvmna
0
zcvmna
False
False
False
0
1
22
22
0
0
0
0
0
0
1
20
5
128, 128, 128
3
Solid
50
No
180
Posted
12/5/2022 3:29:53 AM
Hey guys, so I've been working on building my company and pretty much have the lead generation process down. I just wanted to share some tips on how I source leads so I can start selling my services.
I mean if you have no one to cold-call how will you make sales, right? You need some tools to get this job done.
Pretty much all of the tools I've used for lead generation come down to these three:
**1)** [Octoparse](https://www.octoparse.com)
I've used this to automate lead extraction from sites that have pagination with a whole list of people. For example: Lawyers, Real Estate Agents, Financial Advisors and so on. Octoparse is a bit difficult to learn if you're not tech savvy but once you learn it the amount of leads you can capture is huge.
They have a free and paid version - I just used the free one.
**2)** [Lex Leads](https://www.lexleads.com/)
Lex Leads is pretty much a subscription-based SAAS service that lets you do a host of things. The main things I've used it for are B2B lead capture and LinkedIn prospecting. For B2B lead capture it feeds leads from a huge database with different industries. The LinkedIn prospecting tool lets you extract lead information such as Phone Numbers, Emails, Company Role, etc.
It is paid but they have a 7 day free trial. The paid version gives you unlimited leads.
**3)** [Outscraper](https://outscraper.com)
Outscraper is something I've used a lot for getting leads from Google Maps. It's pretty cheap. For example you can get 10,000 Google Maps leads for like $80 USD. I've used this to get some leads to call for my business. I think they have a promotion where you can get free credits every month. It scrapes leads pretty fast, just takes a few hours to get a
But yeah it's good if you're a BDR or SDR etc who doesn't want to go through every google map lead manually.
So yeah I'd recommend these tools if you're struggling to get leads - even if it's paid you'll at least get many many times the value in return.
But my process basically would involve calling them, qualifying the leads then notating everything in the CRM for the next steps. :) I use Close CRM - because of the built-in dialer.
When you call those leads don't mark them off unless they specifically are unqualified or explicitly say they want you to not call them. It takes 8 touchpoints to sell because people buy off of familiarity.
So often the sale can also be in the follow up... as you may know. So don't let these leads go to waste. :) Start converting them.
Hope you guys find this helpful.
zcvmna
sales
Same_Paint6431
t3_zcvmna
https://www.reddit.com/r/sales/comments/zcvmna/how_i_find_leads_my_process_tools/
12/5/2022 3:29:53 AM
1/1/0001 12:00:00 AM
False
False
43
2
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How I Find Leads - My Process & Tools
False
0.97
zcvmna
0
1
22
22
37
7.99136069114471
6
1.29589632829374
0
0
185
39.9568034557235
463
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
173
Commented
12/5/2022 7:02:22 AM
These are cool, thanks for sharing. If you wanna automate your LinkedIn outreach, let me know :)
iyz5ww7
sales
LazyLeadz
t1_iyz5ww7
https://www.reddit.com/r/sales/comments/zcvmna/how_i_find_leads_my_process_tools/iyz5ww7/
12/5/2022 7:02:22 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
zcvmna
t3_zcvmna
zcvmna
1
zcvmna
False
False
False
0
2
22
22
1
6.25
0
0
0
0
7
43.75
16
128, 128, 128
3
Solid
50
Yes
172
RepliedTo
12/5/2022 7:06:41 AM
I use Lex Leads to do that. It automates LinkedIn messaging & connection requests.
Although I'm thinking of hiring a BDR to do my outreach for me.
iyz68t9
sales
Same_Paint6431
t1_iyz68t9
https://www.reddit.com/r/sales/comments/zcvmna/how_i_find_leads_my_process_tools/iyz68t9/
12/5/2022 7:06:41 AM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iyz5ww7
t1_iyz5ww7
iyz5ww7
1
zcvmna
True
False
False
1
1
22
22
1
3.84615384615385
0
0
0
0
12
46.1538461538462
26
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
171
RepliedTo
12/5/2022 7:14:09 AM
How many meetings per month are you generating via LinkedIn?
iyz6t47
sales
LazyLeadz
t1_iyz6t47
https://www.reddit.com/r/sales/comments/zcvmna/how_i_find_leads_my_process_tools/iyz6t47/
12/5/2022 7:14:09 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iyz68t9
t1_iyz68t9
iyz68t9
1
zcvmna
False
False
False
2
2
22
22
0
0
0
0
0
0
5
50
10
128, 128, 128
3
Solid
50
Yes
170
RepliedTo
12/5/2022 11:15:37 AM
I'd also be interested to know this if you don't mind :)
iyzn4ir
sales
MrBadApple2022
t1_iyzn4ir
https://www.reddit.com/r/sales/comments/zcvmna/how_i_find_leads_my_process_tools/iyzn4ir/
12/5/2022 11:15:37 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iyz6t47
t1_iyz6t47
iyz6t47
1
zcvmna
False
False
False
3
1
22
22
0
0
0
0
0
0
4
36.3636363636364
11
128, 128, 128
3
Solid
50
Yes
169
RepliedTo
12/5/2022 5:17:28 PM
I’m not OP but my clients average ~6 per month with qualified prospects
iz0txha
sales
LazyLeadz
t1_iz0txha
https://www.reddit.com/r/sales/comments/zcvmna/how_i_find_leads_my_process_tools/iz0txha/
12/5/2022 5:17:28 PM
1/1/0001 12:00:00 AM
False
False
-2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iyzn4ir
t1_iyzn4ir
iyzn4ir
0
zcvmna
False
False
False
4
1
22
22
1
7.14285714285714
0
0
0
0
4
28.5714285714286
14
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
166
Commented
11/11/2021 10:36:23 AM
We use Phantombuster just because, but you can check out TexAu; it has an option of running on your desktop. Those two tools regularly come up in the growth hacking communities and gosh these people are really forward-looking with the tools. Nothing beats the classics!
hk6rnuw
sales
Sensitive_Purchase71
t1_hk6rnuw
https://www.reddit.com/r/sales/comments/qr55uu/best_lead_scraper_for_linkedin/hk6rnuw/
11/11/2021 10:36:23 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
qr55uu
t3_qr55uu
qr55uu
0
qr55uu
False
False
False
0
4
45
45
0
0
0
0
0
0
24
52.1739130434783
46
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
168
Posted
11/10/2021 9:51:22 PM
I just joined a company as part of Marketing, but am working very closely with sales. I need to generate top of funnel contact lists that I can then send paid ads, which in turn will bring in leads for Sales to call.
There are dozens and dozens (hundreds?) of scrapers out there and I really don't have time to research and test them all.
I'm looking for a powerful, user-friendly scraper that can automatically give me tens of thousands of contact information based on Linkedin searches. Furthermore, I'd like this scraper platform to be able to create and automate the workflow of connecting with these people on Linkedin and then sending them direct messages.
Bonus Question - Is Sales Navigator necessary for this? I only have Linkedin Basic, but can get access to Sales Nav if necessary. All of these scraper tutorials mention Sales Nav.
What scraping tool would you recommend and why? I checked out Octoparse but seems like a steep learning curve. I'm exploring Phantom Buster now and like what I see... thoughts on Phantom Buster or any other platforms I should check out?
Thanks.
qr55uu
sales
hellletloose94
t3_qr55uu
https://www.reddit.com/r/sales/comments/qr55uu/best_lead_scraper_for_linkedin/
11/10/2021 9:51:22 PM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Best Lead Scraper for Linkedin?
False
0.67
qr55uu
0
4
45
45
6
3.19148936170213
1
0.531914893617021
0
0
84
44.6808510638298
188
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
164
Commented
11/11/2021 8:45:24 AM
https://www.linkedhelper.com/ but you will need paid access to LinkedIn, as this tool extracts only what it sees
hk6k1ur
sales
Global_Divide2795
t1_hk6k1ur
https://www.reddit.com/r/sales/comments/qr55uu/best_lead_scraper_for_linkedin/hk6k1ur/
11/11/2021 8:45:24 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
qr55uu
t3_qr55uu
qr55uu
0
qr55uu
False
False
False
0
4
45
45
0
0
0
0
0
0
7
36.8421052631579
19
128, 128, 128
3
Solid
50
Yes
729
Commented
11/1/2020 5:29:51 PM
<tr>
<td><i class="rr-icon rr-icon-revenue"></i><strong>Revenue</strong></td>
<td>$1.00 Million</td>
</tr>
The value you want is in the next `<td>` tag that comes after the match.
>>> doc.xpath('//strong[text() = "Revenue"]/following::td[1]/text()')
['$1.00 Million']
`following::td` selects all `<td>` tags that come after the match - `[1]` gets the first - `text()` gives you the text content of the tag.
//strong[text() = "Revenue"]/following::td[1]/text()
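To see that XPath in action end to end, here is a minimal sketch using the third-party lxml library (the `doc.xpath(...)` call above suggests lxml; the HTML snippet is the one quoted in this comment):

```python
# Demonstrate the following::td XPath against the quoted HTML fragment.
# Requires the third-party lxml package (pip install lxml).
from lxml import html

snippet = """
<table><tr>
  <td><i class="rr-icon rr-icon-revenue"></i><strong>Revenue</strong></td>
  <td>$1.00 Million</td>
</tr></table>
"""

doc = html.fromstring(snippet)
# Match the <strong> whose text is "Revenue", take the first <td> after it
# in document order, and return that tag's text content.
result = doc.xpath('//strong[text() = "Revenue"]/following::td[1]/text()')
print(result)  # ['$1.00 Million']
```

The `following::` axis walks everything after the match in document order, which is why `[1]` lands on the value cell rather than the label cell the `<strong>` sits in.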
gat5pbv
learnprogramming
commandlineluser
t1_gat5pbv
https://www.reddit.com/r/learnprogramming/comments/jm4oc1/help_with_xpath/gat5pbv/
11/1/2020 5:29:51 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jm4oc1
t3_jm4oc1
jm4oc1
1
jm4oc1
False
False
False
0
1
12
2
4
4.93827160493827
0
0
0
0
52
64.1975308641975
81
128, 128, 128
3
Solid
50
Yes
728
RepliedTo
11/2/2020 3:35:45 AM
>//strong[text() = "Revenue"]/following::td[1]/text()
Thanks a lot, this looks great and should help with gathering details for other datapoints. Though something changed with the RocketReach site, as it redirects automatically to their search page even when logged in, so it's impossible to scrape unless we stay on the page. I'll look for an alternate method but anyways thanks.
gav3fcc
learnprogramming
kartikoli
t1_gav3fcc
https://www.reddit.com/r/learnprogramming/comments/jm4oc1/help_with_xpath/gav3fcc/
11/2/2020 3:35:45 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gat5pbv
t1_gat5pbv
gat5pbv
0
jm4oc1
True
False
False
1
1
2
12
2
3.44827586206897
1
1.72413793103448
0
0
28
48.2758620689655
58
128, 128, 128
3
Solid
50
No
157
Commented
5/30/2019 5:11:19 PM
I don't think Amazon will like (or allow) you to scrape them on a large scale.
If you have an example page and the output you need to generate I could write you up a "tutorial" on how to do it.
epi9s3z
learnpython
commandlineluser
t1_epi9s3z
https://www.reddit.com/r/learnpython/comments/butj5e/amazon_scrape_trying_with_octoparse/epi9s3z/
5/30/2019 5:11:19 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
butj5e
t3_butj5e
butj5e
1
butj5e
False
False
False
0
1
12
12
0
0
0
0
0
0
12
29.2682926829268
41
128, 128, 128
3
Solid
50
No
156
RepliedTo
8/11/2019 2:30:48 AM
I would love it if you could help me with a tutorial for the 999 cart trick to know the amount of stock for a product.
ewkp9uc
learnpython
Yassiiir
t1_ewkp9uc
https://www.reddit.com/r/learnpython/comments/butj5e/amazon_scrape_trying_with_octoparse/ewkp9uc/
8/11/2019 2:30:48 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
epi9s3z
t1_epi9s3z
epi9s3z
0
butj5e
False
False
False
1
1
12
12
1
4
1
4
0
0
8
32
25
128, 128, 128
3
Solid
50
No
159
Posted
5/30/2019 2:01:21 PM
Hi -
I have a (paid) project which requires me to scrape individual pages from Amazon. I had been using [Import.io](https://Import.io), but I wasn't able to afford their monthly fee.
I've been trying to teach myself Python and have paid for a few courses, but my brain doesn't seem to be wired that way. I've been trying for six months, and I am very appreciative of the time investment most people make and realize I'm nowhere close to that! I was able to create a few codes within the actual program (simple ones they teach, eg health potions, etc.) - the thing I can't figure out is how in the world to apply that to real life situations like searching the web for something.
&#x200B;
As an alternative, I have been trying to work with Octoparse. I am able to follow the tutorials and get results, but any deviation causes me grief.
&#x200B;
I have two questions:
1 - Is there a group you can recommend where I can look for Octoparse assistance? Their troubleshooting isn't helping me, and I'd love to have someone point out where I'm going wrong. I don't know if this is something I should pay for?
2 - Is there a link you can point me to that breaks down how to create a Python code and apply it to a real life situation? I attempted to follow a few, but they lost me after the first couple of steps.
&#x200B;
Please assume that I know absolutely nothing about anything in life when you respond. This is an EILI5 situation.
THANK YOU in advance!
butj5e
learnpython
ScrapeLoser
t3_butj5e
https://www.reddit.com/r/learnpython/comments/butj5e/amazon_scrape_trying_with_octoparse/
5/30/2019 2:01:21 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Amazon Scrape (trying with Octoparse)
False
1
butj5e
0
1
12
12
7
2.5830258302583
5
1.8450184501845
0
0
107
39.4833948339483
271
128, 128, 128
3
Solid
50
No
158
Commented
8/27/2019 9:01:12 AM
I have been using [Octoparse](https://www.youtube.com/watch?v=VOoPev_GzUM&t=13s) for a long time; they have a great support team you can ask for assistance from a troop of data experts.
ey7r4r9
learnpython
Octoparse
t1_ey7r4r9
https://www.reddit.com/r/learnpython/comments/butj5e/amazon_scrape_trying_with_octoparse/ey7r4r9/
8/27/2019 9:01:12 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
butj5e
t3_butj5e
butj5e
0
butj5e
False
False
False
0
1
12
12
2
5.55555555555556
0
0
0
0
14
38.8888888888889
36
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
155
Commented
5/30/2019 2:18:46 PM
> Is there a link you can point me to that breaks down how to create a Python code and apply it to a real life situation? I attempted to follow a few, but they lost me after the first couple of steps.
Maybe look at the book Automate the Boring Stuff? Everything in it is real life examples and for beginners.
ephf1lz
learnpython
ScreamingIsMyAir
t1_ephf1lz
https://www.reddit.com/r/learnpython/comments/butj5e/amazon_scrape_trying_with_octoparse/ephf1lz/
5/30/2019 2:18:46 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
butj5e
t3_butj5e
butj5e
1
butj5e
False
False
False
0
5
12
12
0
0
3
4.91803278688525
0
0
24
39.344262295082
61
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
154
RepliedTo
5/30/2019 2:53:13 PM
>automate the boring stuff
That's the Udemy course I took! It may be time for me to pursue a different path... He was really helpful, but I still wasn't able to understand it.
ephkyr2
learnpython
ScrapeLoser
t1_ephkyr2
https://www.reddit.com/r/learnpython/comments/butj5e/amazon_scrape_trying_with_octoparse/ephkyr2/
5/30/2019 2:53:13 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ephf1lz
t1_ephf1lz
ephf1lz
1
butj5e
True
False
False
1
4
12
12
1
3.03030303030303
1
3.03030303030303
0
0
12
36.3636363636364
33
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
153
RepliedTo
5/30/2019 2:55:58 PM
You said you've been trying to learn for 6 months... but how many days a week, how many hours? 6 months isn't a lot of time if it's a few hours a week. Thinking like a programmer takes a very long time; how long did it take to learn English growing up? I'm certain you weren't speaking sentences (probably not even single words!) by 6 months, let alone writing words.
ephlfbq
learnpython
ScreamingIsMyAir
t1_ephlfbq
https://www.reddit.com/r/learnpython/comments/butj5e/amazon_scrape_trying_with_octoparse/ephlfbq/
5/30/2019 2:55:58 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ephkyr2
t1_ephkyr2
ephkyr2
1
butj5e
False
False
False
2
5
12
12
0
0
0
0
0
0
35
50
70
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
152
RepliedTo
5/30/2019 5:11:20 PM
Absolutely, I completely agree. I comb through these forums a lot, and I know how frustrated serious programmers can get with people who don't invest time in learning something.
I do not know how to translate the things I DO learn into real life applications. There is a disconnect for me somewhere. I don't expect to learn much in such a short time, but I would hope to retain something.
epi9s5z
learnpython
ScrapeLoser
t1_epi9s5z
https://www.reddit.com/r/learnpython/comments/butj5e/amazon_scrape_trying_with_octoparse/epi9s5z/
5/30/2019 5:11:20 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ephlfbq
t1_ephlfbq
ephlfbq
1
butj5e
True
False
False
3
4
12
12
0
0
1
1.42857142857143
0
0
31
44.2857142857143
70
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
151
RepliedTo
5/30/2019 5:18:35 PM
You may find trying to do the opposite easier: instead of learning and then taking what you learn somewhere, find something you want to do and learn how to do it by breaking it into learnable chunks. Let's use web scraping as an example.
1) Google 'web scraping utilities python3'
You see Beautiful Soup mentioned a lot in a few of the links you gather.
2) You now know you want to use bs4, so you google 'get basic information from a webpage beautiful soup python3'
You've now learned to get a few items from a page and you've successfully printed them on your screen.
3) Well, now I have this information, how do I store it? So we google how to store variables in an external file. You see JSON mentioned a lot, so you google storing basic information with json and python 3
4) You now have stored those few elements in a JSON file, but realize you want multiple pages. So back to Google: 'how to scrape multiple pages beautiful soup python'. You see for loops mentioned a lot, so you try a for loop and it works, but then you realize it's overwriting your JSON file, so you google how to store multiple things in a json file, etc.
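The workflow in those steps (parse pages, loop over several of them, store everything in one JSON file instead of overwriting it per page) can be sketched with nothing but the standard library. This is a minimal sketch, assuming hardcoded HTML stand-ins for fetched pages; in practice you'd fetch real pages and likely use BeautifulSoup instead of `HTMLParser`.

```python
# Minimal stdlib-only sketch of the scrape-then-store workflow described above.
# The HTML pages are hardcoded stand-ins for pages you would actually fetch.
import json
from html.parser import HTMLParser

class TitleCollector(HTMLParser):
    """Collect the text content of every <h2> tag on a page."""
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_h2 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        if self._in_h2:
            self.titles.append(data.strip())

pages = [
    "<html><body><h2>First item</h2><h2>Second item</h2></body></html>",
    "<html><body><h2>Third item</h2></body></html>",
]

all_titles = []
for page_html in pages:        # loop over multiple "pages" (step 4)
    parser = TitleCollector()
    parser.feed(page_html)
    all_titles.extend(parser.titles)   # accumulate instead of overwriting

with open("titles.json", "w") as f:    # one dump at the end (step 3)
    json.dump(all_titles, f)

print(all_titles)  # ['First item', 'Second item', 'Third item']
```

Accumulating into one list and dumping once at the end is exactly the fix for the "it's overwriting your JSON file" problem the steps mention.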
epib3h2
learnpython
ScreamingIsMyAir
t1_epib3h2
https://www.reddit.com/r/learnpython/comments/butj5e/amazon_scrape_trying_with_octoparse/epib3h2/
5/30/2019 5:18:35 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
epi9s5z
t1_epi9s5z
epi9s5z
0
butj5e
False
False
False
4
5
12
12
7
3.30188679245283
1
0.471698113207547
0
0
96
45.2830188679245
212
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
149
Commented
4/3/2023 3:38:19 AM
If it has infinite scroll then you can try to check whether, each time it loads new data, it is making a POST request or loading a new page by just adding parameters, like this: https://domain.com?page=10
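If the site turns out to use a page parameter, you can build those URLs directly instead of simulating scrolling. A small sketch with the standard library (`domain.com` is the placeholder from the comment, not a real endpoint):

```python
# Build and inspect "?page=N" URLs of the kind mentioned above.
from urllib.parse import urlencode, urlsplit, parse_qs

def page_url(base, page):
    """Build a paginated URL like https://domain.com?page=10."""
    return f"{base}?{urlencode({'page': page})}"

urls = [page_url("https://domain.com", n) for n in range(1, 4)]
print(urls)  # ['https://domain.com?page=1', 'https://domain.com?page=2', 'https://domain.com?page=3']

# The reverse check: pull the page number back out of a URL you saw
# in the browser's network tab.
qs = parse_qs(urlsplit(urls[-1]).query)
print(qs["page"])  # ['3']
```

Watching the browser's network tab while scrolling is how you'd discover whether the new data arrives via such a parameter or via a POST request.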
jequpkf
webscraping
Elitedoorhugger
t1_jequpkf
https://www.reddit.com/r/webscraping/comments/129t5z7/hi_everyone_i_dont_know_anything_about_web/jequpkf/
4/3/2023 3:38:19 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
129t5z7
t3_129t5z7
129t5z7
0
129t5z7
False
False
False
0
4
16
16
0
0
0
0
0
0
14
35
40
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
148
Commented
4/3/2023 3:38:12 AM
If it has infinite scroll then you can try to check whether, each time it loads new data, it is making a POST request or loading a new page by just adding parameters, like this: https://domain.com?page=10
jequova
webscraping
Elitedoorhugger
t1_jequova
https://www.reddit.com/r/webscraping/comments/129t5z7/hi_everyone_i_dont_know_anything_about_web/jequova/
4/3/2023 3:38:12 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
129t5z7
t3_129t5z7
129t5z7
0
129t5z7
False
False
False
0
4
16
16
0
0
0
0
0
0
14
35
40
128, 128, 128
3
Solid
50
No
146
Commented
4/3/2023 3:48:22 PM
I found this resource useful; they also have a discord server for questions
https://substack.thewebscraping.club/
jesrb9m
webscraping
Ok-Computer9983
t1_jesrb9m
https://www.reddit.com/r/webscraping/comments/129t5z7/hi_everyone_i_dont_know_anything_about_web/jesrb9m/
4/3/2023 3:48:22 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
129t5z7
t3_129t5z7
129t5z7
0
129t5z7
False
False
False
0
1
16
16
1
7.69230769230769
1
7.69230769230769
0
0
4
30.7692307692308
13
128, 128, 128
3
Solid
50
Yes
145
Commented
4/3/2023 12:55:14 AM
Try scraping with python
jeqaj1y
webscraping
ivanoski-007
t1_jeqaj1y
https://www.reddit.com/r/webscraping/comments/129t5z7/hi_everyone_i_dont_know_anything_about_web/jeqaj1y/
4/3/2023 12:55:14 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
129t5z7
t3_129t5z7
129t5z7
1
129t5z7
False
False
False
0
1
16
16
0
0
0
0
0
0
3
75
4
128, 128, 128
3
Solid
50
Yes
144
RepliedTo
4/3/2023 4:11:06 AM
Ok
jeqybl5
webscraping
Majestic-Dust4427
t1_jeqybl5
https://www.reddit.com/r/webscraping/comments/129t5z7/hi_everyone_i_dont_know_anything_about_web/jeqybl5/
4/3/2023 4:11:06 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jeqaj1y
t1_jeqaj1y
jeqaj1y
0
129t5z7
True
False
False
1
1
16
16
0
0
0
0
0
0
1
100
1
128, 128, 128
3.00094607379376
Solid
49.9959453980268
No
150
Posted
4/2/2023 6:12:44 PM
Hi guys, as the title says I don't know anything about web scraping and I'm trying to scrape website urls by date from (https://wordpress.com/read/search). I used Web Scraper.io, Power Automate and Octoparse, and while scraping I was getting the same error on all sites: the selector wasn't selecting everything, and when I scrolled to check, the selected ones were gone. When I tried to export the data I was only getting 10 to 15 urls. Also the WordPress page has infinite scroll.
Can you please help me, or guide/explain the error and how to fix it?
Thanks
129t5z7
webscraping
Majestic-Dust4427
t3_129t5z7
https://www.reddit.com/r/webscraping/comments/129t5z7/hi_everyone_i_dont_know_anything_about_web/
4/2/2023 6:12:44 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hi everyone, i don't know anything about web scraping can you help me?
False
0.5
129t5z7
0
2
16
16
0
0
3
2.80373831775701
0
0
50
46.7289719626168
107
128, 128, 128
3.00094607379376
Solid
49.9959453980268
No
147
Commented
4/2/2023 6:15:10 PM
Also here are some points for the page:
- you need to register or log in to see the "Sites" column
- the page has infinite scroll
jeos2w5
webscraping
Majestic-Dust4427
t1_jeos2w5
https://www.reddit.com/r/webscraping/comments/129t5z7/hi_everyone_i_dont_know_anything_about_web/jeos2w5/
4/2/2023 6:15:10 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
129t5z7
t3_129t5z7
129t5z7
0
129t5z7
True
False
False
0
2
16
16
0
0
0
0
0
0
9
36
25
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
143
Commented
4/2/2023 10:16:11 PM
Where is your code? Difficult to help without...
jeppy41
webscraping
GillesQuenot
t1_jeppy41
https://www.reddit.com/r/webscraping/comments/129t5z7/hi_everyone_i_dont_know_anything_about_web/jeppy41/
4/2/2023 10:16:11 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
129t5z7
t3_129t5z7
129t5z7
1
129t5z7
False
False
False
0
2
16
16
0
0
1
12.5
0
0
3
37.5
8
128, 128, 128
3
Solid
50
Yes
142
RepliedTo
4/3/2023 4:10:58 AM
I didn't use code, I used some software like Octoparse and Power Automate
jeqyb2p
webscraping
Majestic-Dust4427
t1_jeqyb2p
https://www.reddit.com/r/webscraping/comments/129t5z7/hi_everyone_i_dont_know_anything_about_web/jeqyb2p/
4/3/2023 4:10:58 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jeppy41
t1_jeppy41
jeppy41
1
129t5z7
True
False
False
1
1
16
16
0
0
0
0
0
0
7
53.8461538461538
13
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
141
RepliedTo
4/3/2023 8:43:50 AM
No experience with `no-code` tools. I'm an experienced developer. Can do it for you, feel free to PM (not free).
jerj7q0
webscraping
GillesQuenot
t1_jerj7q0
https://www.reddit.com/r/webscraping/comments/129t5z7/hi_everyone_i_dont_know_anything_about_web/jerj7q0/
4/3/2023 8:43:50 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jeqyb2p
t1_jeqyb2p
jeqyb2p
0
129t5z7
False
False
False
2
2
16
16
2
10
0
0
0
0
6
30
20
128, 128, 128
3
Solid
50
No
140
Posted
4/7/2021 1:29:15 PM
Hi there!
I am very new to web scraping. I am looking to extract all replies to a tweet into an Excel file.
Here are the tweet attributes I am trying to collect: Name, twitter handle, date, body text of tweet, number of replies, number of retweets, and number of likes.
I am currently using Octoparse, but after 90 minutes it has only collected ~40 tweets. The responses I am trying to gather number in the thousands.
There must be a better way?
Any insight the community can provide will be greatly appreciated. Thank you so much.
mm23ao
webscraping
cl1ffhanger
t3_mm23ao
https://www.reddit.com/r/webscraping/comments/mm23ao/looking_for_advice_need_to_extract_all_replies_to/
4/7/2021 1:29:15 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Looking for advice - need to extract all replies to a single tweet
False
1
mm23ao
0
1
10
10
5
5.15463917525773
0
0
0
0
43
44.3298969072165
97
128, 128, 128
3
Solid
50
No
139
Commented
4/7/2021 2:14:26 PM
Are you familiar with Python? There’s a library called Tweepy that might be able to help.
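For context on the Tweepy suggestion above: Tweepy's v2 `Client` exposes `search_recent_tweets`, and replies to a tweet share its `conversation_id`. The sketch below is illustrative only; the bearer token and tweet ID are placeholders, and the network call is not exercised here.

```python
def replies_query(tweet_id):
    """Build a Twitter API v2 search query matching replies in a tweet's conversation."""
    return f"conversation_id:{tweet_id} is:reply"

def fetch_replies(bearer_token, tweet_id):
    """Sketch of the actual call; requires real credentials, so it is not run here."""
    import tweepy  # imported lazily so the query helper above stays standalone
    client = tweepy.Client(bearer_token=bearer_token)
    return client.search_recent_tweets(
        query=replies_query(tweet_id),
        tweet_fields=["author_id", "created_at", "public_metrics"],
        max_results=100,
    )
```

Note that the recent-search endpoint only covers roughly the last seven days of tweets, so this would not reach older reply threads.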
gtouvuo
webscraping
sudodoyou
t1_gtouvuo
https://www.reddit.com/r/webscraping/comments/mm23ao/looking_for_advice_need_to_extract_all_replies_to/gtouvuo/
4/7/2021 2:14:26 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
mm23ao
t3_mm23ao
mm23ao
1
mm23ao
False
False
False
0
1
10
10
0
0
0
0
0
0
6
35.2941176470588
17
128, 128, 128
3
Solid
50
No
138
RepliedTo
4/13/2021 1:48:38 PM
You will very quickly hit the rate limits.
gudc4jk
webscraping
universecoder
t1_gudc4jk
https://www.reddit.com/r/webscraping/comments/mm23ao/looking_for_advice_need_to_extract_all_replies_to/gudc4jk/
4/13/2021 1:48:38 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gtouvuo
t1_gtouvuo
gtouvuo
0
mm23ao
False
False
False
1
1
10
10
0
0
1
12.5
0
0
4
50
8
128, 128, 128
3
Solid
50
No
137
Posted
9/15/2020 1:49:12 PM
https://oxofiles.com/octoparse-download/
it94dy
software
JafirGull
t3_it94dy
https://www.reddit.com/r/software/comments/it94dy/octoparse_download_2020_latest_for_windows_10_8_7/
9/15/2020 1:49:12 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse Download (2020 Latest) for Windows 10, 8, 7 | OXO Files
False
1
it94dy
0
1
1
1
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
136
Commented
6/11/2020 4:23:49 PM
It's handy for this level of information, but it doesn't gather the website of the business, which I find is most important to people. Once you get into scraping the actual pages, scraping Yelp is much harder due to the security they have, and a simple Octoparse won't do the trick.
For quick results with name, address, and phone number it seems pretty handy. I don't own it though.
ftpjs1z
webscraping
B33rNuts
t1_ftpjs1z
https://www.reddit.com/r/webscraping/comments/h0tq3h/how_to_scrape_yelp_data_to_excel_business_names/ftpjs1z/
6/11/2020 4:23:49 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h0tq3h
t3_h0tq3h
h0tq3h
1
h0tq3h
False
False
False
0
2
5
5
4
5.79710144927536
1
1.44927536231884
0
0
26
37.6811594202899
69
128, 128, 128
3
Solid
50
Yes
135
RepliedTo
6/12/2020 2:28:23 AM
But I think you can modify the crawler and get the websites of businesses pretty easily with Octoparse... just click on the website link and extract it
ftrij6j
webscraping
Millyfang
t1_ftrij6j
https://www.reddit.com/r/webscraping/comments/h0tq3h/how_to_scrape_yelp_data_to_excel_business_names/ftrij6j/
6/12/2020 2:28:23 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ftpjs1z
t1_ftpjs1z
ftpjs1z
1
h0tq3h
True
False
False
1
1
5
5
1
3.7037037037037
0
0
0
0
11
40.7407407407407
27
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
134
RepliedTo
6/12/2020 3:25:49 AM
To do this for Yelp you have to click through each listing. But Yelp hires Distil Networks to police their site for bots. There is no way such a simple script will be able to get past it without being banned.
ftro03b
webscraping
B33rNuts
t1_ftro03b
https://www.reddit.com/r/webscraping/comments/h0tq3h/how_to_scrape_yelp_data_to_excel_business_names/ftro03b/
6/12/2020 3:25:49 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ftrij6j
t1_ftrij6j
ftrij6j
1
h0tq3h
False
False
False
2
2
5
5
0
0
0
0
0
0
18
45
40
128, 128, 128
3
Solid
50
No
133
RepliedTo
6/12/2020 12:09:57 PM
This is true. I'd really like to see a video of Octoparse getting around these anti-bot frameworks and successfully scraping non-trivial amounts of data from Yelp.
ful8ykx
webscraping
matty_fu
t1_ful8ykx
https://www.reddit.com/r/webscraping/comments/h0tq3h/how_to_scrape_yelp_data_to_excel_business_names/ful8ykx/
6/12/2020 12:09:57 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ftro03b
t1_ftro03b
ftro03b
0
h0tq3h
False
False
False
3
1
5
5
1
3.57142857142857
1
3.57142857142857
0
0
15
53.5714285714286
28
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
232
Posted
6/4/2020 2:59:48 AM
https://www.youtube.com/watch?v=TeWDWQSRIZI&feature=youtu.be
gw8v79
webscraping
Millyfang
t3_gw8v79
https://www.reddit.com/r/webscraping/comments/gw8v79/how_to_scrape_historial_twitter_data_with/
6/4/2020 2:59:48 AM
1/1/0001 12:00:00 AM
False
False
2
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to scrape historical Twitter data with Octoparse
False
0.62
gw8v79
0
29
5
5
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
230
Commented
6/4/2020 3:01:10 AM
You can [click here](https://www.octoparse.com/download?re=) to install Octoparse on your computer. Now, let’s take a look at how to build a Twitter crawler within 3 minutes.
Step 1: Input the URL and build a pagination
Step 2: Build a loop item to extract the data
Step 3: Modify the pagination setting and execute the crawler
fstjj7w
webscraping
Millyfang
t1_fstjj7w
https://www.reddit.com/r/webscraping/comments/gw8v79/how_to_scrape_historial_twitter_data_with/fstjj7w/
6/4/2020 3:01:10 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gw8v79
t3_gw8v79
gw8v79
1
gw8v79
True
False
False
0
29
5
5
0
0
0
0
0
0
31
50.8196721311475
61
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
229
Posted
6/5/2020 3:04:23 AM
https://www.octoparse.com/blog/best-web-scraper-for-mac?re=
gwwgw5
webscraping
Millyfang
t3_gwwgw5
https://www.reddit.com/r/webscraping/comments/gwwgw5/best_web_scraper_for_mac_octoparse/
6/5/2020 3:04:23 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Best web scraper for Mac: Octoparse
False
0.33
gwwgw5
0
29
5
5
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
228
Posted
6/12/2020 6:26:52 AM
[https://www.octoparse.com/blog/how-to-scrape-yelp-data-to-excel](https://www.octoparse.com/blog/how-to-scrape-yelp-data-to-excel?re=)
[https://youtu.be/yu8vUFIMYzE](https://youtu.be/yu8vUFIMYzE)
## Step 1: Input the Yelp website URL to build a web crawler
## Step 2: Check the pagination setting and the data preview
## Step 3: Create your workflow and execute the Yelp crawler
h7fgrs
pythontips
Millyfang
t3_h7fgrs
https://www.reddit.com/r/pythontips/comments/h7fgrs/how_to_extract_yelp_data_to_excel_with_octoparse/
6/12/2020 6:26:52 AM
1/1/0001 12:00:00 AM
False
False
2
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Extract Yelp Data to Excel with Octoparse in 5 mins
False
0.76
h7fgrs
0
29
5
5
2
3.07692307692308
0
0
0
0
35
53.8461538461538
65
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
227
Posted
6/11/2020 6:31:47 AM
To get data from Yelp, it only takes 5 minutes to build a crawler with [Octoparse](https://www.octoparse.com/download?re=).
**Step 1: Input the Yelp website URL to build a web crawler**
You can do this by simple copy-and-paste. Give it a few seconds. Octoparse will detect the webpage data automatically. Once the detection is done, you can see data fields highlighted in red. This means that all the highlighted data is preselected by the bot.
https://reddit.com/link/h0tq3h/video/bkm6w61bf8451/player
**Step 2: Check the pagination setting and the data preview**
Usually, the pagination button is auto-detected, and you can check its position. But if that’s not the case, you can easily select the button manually by clicking on “edit” on the Tips panel and confirm your selection.
There is a data preview section below that allows you to preview your data at the bottom and choose how you'd like the data to appear. For example, you can edit the names of the data fields, change the sequence or delete them.
**Step 3: Create your workflow and execute the Yelp crawler**
Once you’ve made sure the data columns look perfect, simply hit “create workflow” and Octoparse will auto-generate a scraping workflow for you on the left-hand side. The workflow tells us that our crawler will extract the listing data one by one on the first page, and then head to the following pages to repeat the extraction on each page.
You can choose to run your crawler on your computer or on Octoparse cloud servers. We usually recommend the latter as it allows you to schedule your extractions and can get data for you while you are sleeping. But the local extraction also works great for a one-time project. It is totally up to you.
h0tq3h
webscraping
Millyfang
t3_h0tq3h
https://www.reddit.com/r/webscraping/comments/h0tq3h/how_to_scrape_yelp_data_to_excel_business_names/
6/11/2020 6:31:47 AM
6/11/2020 7:18:58 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Scrape Yelp Data to Excel: Business Names, Phone Numbers, Addresses, Reviews...
False
0.5
h0tq3h
0
29
5
5
4
1.33333333333333
1
0.333333333333333
0
0
143
47.6666666666667
300
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
226
Posted
6/4/2020 2:44:47 AM
https://v.redd.it/z2hkj2qt3t251
gw8n4r
scrapinghub
Millyfang
t3_gw8n4r
https://www.reddit.com/r/scrapinghub/comments/gw8n4r/how_to_extract_tweets_from_twitter_with_octoparse/
6/4/2020 2:44:47 AM
1/1/0001 12:00:00 AM
False
False
4
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to extract tweets from Twitter with Octoparse in 5 minutes
False
0.75
gw8n4r
0
29
5
5
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
225
Commented
6/4/2020 2:45:42 AM
Here's a way to scrape Twitter data in 5 minutes without using the Twitter API, Tweepy, Python, or writing a single line of code: an automated web scraping tool, [Octoparse](http://www.octoparse.com/?re=).
As Octoparse simulates human interaction with a webpage, it allows you to pull all the information you see on any website, such as Twitter.
For example, you can [easily extract tweets from a handle, tweets containing certain hashtags, or tweets posted within a specific time frame, etc.](https://www.octoparse.com/blog/how-to-extract-data-from-twitter?re=). All you need to do is grab the URL of your target webpage and paste it into Octoparse's built-in browser. Within a few point-and-clicks, you will be able to create a crawler from scratch by yourself. When the extraction is completed, you can export the data to Excel sheets, CSV, HTML, or SQL, or you can stream it into your database in real time via the Octoparse APIs.
fsthylf
scrapinghub
Millyfang
t1_fsthylf
https://www.reddit.com/r/scrapinghub/comments/gw8n4r/how_to_extract_tweets_from_twitter_with_octoparse/fsthylf/
6/4/2020 2:45:42 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gw8n4r
t3_gw8n4r
gw8n4r
0
gw8n4r
True
False
False
0
29
5
5
1
0.613496932515337
1
0.613496932515337
0
0
86
52.760736196319
163
128, 128, 128
3
Solid
50
No
132
Commented
6/11/2020 9:58:42 AM
Octoparse is an amazing tool for web scraping!
ftoip4t
webscraping
anon970529
t1_ftoip4t
https://www.reddit.com/r/webscraping/comments/h0tq3h/how_to_scrape_yelp_data_to_excel_business_names/ftoip4t/
6/11/2020 9:58:42 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h0tq3h
t3_h0tq3h
h0tq3h
0
h0tq3h
False
False
False
0
1
5
5
0
0
0
0
0
0
5
62.5
8
128, 128, 128
3
Solid
50
No
130
Commented
7/10/2020 7:42:59 PM
Hey Juan, I've got a little knowledge of Octoparse, enough to know it's mostly a point-and-click interface. I think for something like this you probably want something that supports some code. I think a good middle ground is something like [Apify.com](https://Apify.com)
If you were willing to drop me a line at [bd@scrapediary.com](mailto:bd@scrapediary.com) I'd be more than happy to lend a hand and help you build something.
PS: at [scrapediary.com](https://scrapediary.com) I write a newsletter covering this sort of stuff, if you're interested
fxkad37
webscraping
brycedavies
t1_fxkad37
https://www.reddit.com/r/webscraping/comments/horcqt/need_some_help_with_octoparse/fxkad37/
7/10/2020 7:42:59 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
horcqt
t3_horcqt
horcqt
0
horcqt
False
False
False
0
1
46
46
5
5.15463917525773
0
0
0
0
43
44.3298969072165
97
128, 128, 128
3
Solid
50
No
131
Posted
7/10/2020 3:50:25 PM
My roommate and I are trying to extract some data from a website.
We need Octoparse to look at row 1 of a table, extract that data, then click on a link in that row, follow it through to another page, record the data for the same row, then loop and do the same for about 2,000 entries in a paginated table.
Problem is, there are several entries where that second bit of data is either empty or entirely missing.
Is there a streamlined way to do this in the workflow?
Reward for the first person to come up with an answer.
horcqt
webscraping
JuanReasley
t3_horcqt
https://www.reddit.com/r/webscraping/comments/horcqt/need_some_help_with_octoparse/
7/10/2020 3:50:25 PM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Need some help with Octoparse
False
0.67
horcqt
0
1
46
46
2
1.92307692307692
1
0.961538461538462
0
0
41
39.4230769230769
104
128, 128, 128
3
Solid
50
No
129
Commented
7/10/2020 6:29:44 PM
You'll need to come up with a way to programmatically tell whether the table row you're looking at has valid data or not. It may also be easier to scan the whole table and store all of the links and data from it first, then go back and load each of those pages after you have already scanned the entire table. I really like doing this kind of thing asynchronously with AWS Lambda and SQS
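The two-pass idea described above (scan the entire table first, then visit each stored link) can be sketched in plain Python. The row dicts and the dictionary-based "fetch" are hypothetical stand-ins for the real table rows and HTTP requests:

```python
def scan_table(rows):
    """Phase 1: collect (data, link) pairs from every table row up front."""
    return [(row.get("data"), row.get("link")) for row in rows]

def fetch_detail(link, pages):
    """Simulated page fetch; a real version would make an HTTP request.
    Returning None models a detail page that is empty or missing."""
    return pages.get(link)

def scrape(rows, pages):
    """Phase 2: visit each stored link only after the full table scan."""
    results = []
    for data, link in scan_table(rows):
        detail = fetch_detail(link, pages)
        results.append((data, detail))  # detail may be None for missing rows
    return results
```

Separating the two phases means a missing or empty detail page never derails the table scan itself, which is the failure mode the poster describes.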
fxk1ek5
webscraping
justinprather
t1_fxk1ek5
https://www.reddit.com/r/webscraping/comments/horcqt/need_some_help_with_octoparse/fxk1ek5/
7/10/2020 6:29:44 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
horcqt
t3_horcqt
horcqt
0
horcqt
False
False
False
0
1
46
46
1
1.31578947368421
0
0
0
0
35
46.0526315789474
76
128, 128, 128
3
Solid
50
No
128
Posted
11/4/2017 10:36:36 PM
Hello:
I’m having trouble with scraping just one field of text from Amazon.
In short: I often cannot see the Arrives By dates in the Offer Listings, and I can’t figure out why.
Consider the listing:
https://www.amazon.com/gp/offer-listing/B001VJ5A2W/ref=olp_f_freeShipping?ie=UTF8&f_all=true&f_freeShipping=true&f_new=true&f_primeEligible=true
And here is the ever-elusive box: https://imgur.com/a/WWTGp
When I pull up this page in Chrome or Edge, it shows a box which states "Arrives by Mon, Nov. 6" on the top right. It doesn't show up in Internet Explorer. It doesn't show up in Firefox, even when I spoof my User-Agent to Chrome or Edge!
I'm trying to use various scrapers to get this data. Octoparse can see it just fine, but what it can do appears to be limited (it can't schedule tasks on the local machine, and off-loading to their cloud is too slow). Content Grabber 2 appears very powerful, yet I cannot see the Arrives By information with it. I viewed the HTML in Content Grabber 2 and the Arrives By date div wasn't even there! FMiner Pro also can't see the Arrives By data.
I can't see what's so special about this box; it just looks like a normal div?
If I could just figure out what's so unique about this box that it always appears for some browsers but never for others, then perhaps I could fix this! But I'm completely lost!
Any ideas? Thank you in advance!
7attjl
scrapinghub
Lucid-Dreamx
t3_7attjl
https://www.reddit.com/r/scrapinghub/comments/7attjl/cant_scrape_arrives_by_date_from_amazons_offer/
11/4/2017 10:36:36 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Can't Scrape "Arrives By" date from Amazon's Offer Listings.
False
1
7attjl
0
1
69
69
5
2.08333333333333
5
2.08333333333333
0
0
91
37.9166666666667
240
128, 128, 128
3
Solid
50
No
127
Commented
11/21/2017 1:35:26 PM
Try using some other online services. Websites like Moz or Datahen will give you info about possible solutions, or may even solve your problem for you. They are quite popular right now, so I think they do the job decently.
dq50hpl
scrapinghub
Haiko_Hayn
t1_dq50hpl
https://www.reddit.com/r/scrapinghub/comments/7attjl/cant_scrape_arrives_by_date_from_amazons_offer/dq50hpl/
11/21/2017 1:35:26 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
7attjl
t3_7attjl
7attjl
0
7attjl
False
False
False
0
1
69
69
2
5.12820512820513
2
5.12820512820513
0
0
16
41.025641025641
39
128, 128, 128
3
Solid
50
No
126
Commented
10/29/2021 8:41:49 AM
Spam.
hihnfvy
webscraping
nubela
t1_hihnfvy
https://www.reddit.com/r/webscraping/comments/qi88b9/introducing_the_new_octoparse_84/hihnfvy/
10/29/2021 8:41:49 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
qi88b9
t3_qi88b9
qi88b9
0
qi88b9
False
False
False
0
1
3
3
0
0
0
0
0
0
1
100
1
128, 128, 128
3
Solid
50
No
125
Posted
6/21/2021 9:19:55 AM
There are many web scraping tools used to collect various data: Luminati, ParseHub, and many others.
Today I would suggest you try out this tool called **Octoparse**, a cloud-based, coding-free web scraping tool that helps you extract data from any website.
There are some [prebuilt templates](https://helpcenter.octoparse.com/hc/en-us/articles/900003158843-Task-Templates-Version-8-) for social media scraping, or you can use the “Advanced Mode” feature to scrape any webpage you like by:
1. Click “+Task” to initiate a new task under the [Advanced Mode](https://helpcenter.octoparse.com/hc/en-us/articles/360018281431-Advanced-Mode).
2. Insert the URL of the selected webpage in the text box.
3. Click on “Save URL.”
Find more information here **https://www.octoparse.com/**
o4r82b
u_pekinson
pekinson
t3_o4r82b
https://www.reddit.com/r/u_pekinson/comments/o4r82b/how_do_i_collect_data_for_social_network_analysis/
6/21/2021 9:19:55 AM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How do I Collect Data for Social Network Analysis
True
0.99
o4r82b
0
1
1
1
4
3.05343511450382
1
0.763358778625954
0
0
71
54.1984732824427
131
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
122
Commented
4/1/2022 5:36:17 AM
Did you ever find a working solution for this? I use cypress (maybe overkill) with great success, navigating through complex user interfaces to get data.
i2y6l19
programmingrequests
TryCatchLife
t1_i2y6l19
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/i2y6l19/
4/1/2022 5:36:17 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
sha43h
t3_sha43h
sha43h
0
sha43h
False
False
False
0
4
23
23
2
8
2
8
0
0
10
40
25
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
121
Commented
4/1/2022 5:36:17 AM
Did you ever find a working solution for this? I use cypress (maybe overkill) with great success, navigating through complex user interfaces to get data.
i2y6l19
programmingrequests
TryCatchLife
t1_i2y6l19
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/i2y6l19/
4/1/2022 5:36:17 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
sha43h
t3_sha43h
sha43h
0
sha43h
False
False
False
0
4
23
23
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
120
Commented
2/1/2022 5:38:27 PM
Power Automate Desktop from Microsoft is a pretty good tool to look into if you want to do it yourself with little programming. I also see a reporting tab; double-check whether there's any data there you can download directly.
hv5q9io
programmingrequests
GSxHidden
t1_hv5q9io
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/hv5q9io/
2/1/2022 5:38:27 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
sha43h
t3_sha43h
sha43h
1
sha43h
False
False
False
0
4
23
23
2
5.12820512820513
0
0
0
0
17
43.5897435897436
39
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
119
RepliedTo
2/18/2022 11:15:35 PM
I did upgrade to Win11 so that I could fire up Power Automate. Getting deep into the webpage, I ran into the exact same errors I did when I was trying to do it with Octoparse. There's something with the login cookies that isn't being passed between steps. I can get the first truck's info fine, but looping back in for the next one it fails.
hxie8f9
programmingrequests
RavenTBK
t1_hxie8f9
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/hxie8f9/
2/18/2022 11:15:35 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hv5q9io
t1_hv5q9io
hv5q9io
1
sha43h
True
False
False
1
4
23
23
1
1.53846153846154
2
3.07692307692308
0
0
28
43.0769230769231
65
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
118
Commented
2/1/2022 5:38:27 PM
Power Automate Desktop from Microsoft is a pretty good tool to look into if you want to do it yourself with little programming. I also see a reporting tab; double-check whether there's any data there you can download directly.
hv5q9io
programmingrequests
GSxHidden
t1_hv5q9io
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/hv5q9io/
2/1/2022 5:38:27 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
sha43h
t3_sha43h
sha43h
1
sha43h
False
False
False
0
4
23
23
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
117
RepliedTo
2/18/2022 11:15:35 PM
I did upgrade to Win11 so that I could fire up Power Automate. Getting deep into the webpage, I ran into the exact same errors I did when I was trying to do it with Octoparse. There's something with the login cookies that isn't being passed between steps. I can get the first truck's info fine, but looping back in for the next one it fails.
hxie8f9
programmingrequests
RavenTBK
t1_hxie8f9
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/hxie8f9/
2/18/2022 11:15:35 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hv5q9io
t1_hv5q9io
hv5q9io
1
sha43h
True
False
False
1
4
23
23
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
116
RepliedTo
3/3/2022 11:15:22 PM
If you still haven't found a solution I'd like to offer my help.
I'm developing an automation/scraping tool in my free time and would like to see what people might use it for.
Feel free to send me a DM if still interested.
hz8w97l
programmingrequests
tnilk
t1_hz8w97l
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/hz8w97l/
3/3/2022 11:15:22 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxie8f9
t1_hxie8f9
hxie8f9
0
sha43h
False
False
False
2
4
23
23
2
4.54545454545455
0
0
0
0
19
43.1818181818182
44
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
115
RepliedTo
3/3/2022 11:15:22 PM
If you still haven't found a solution I'd like to offer my help.
I'm developing an automation/scraping tool in my free time and would like to see what people might use it for.
Feel free to send me a DM if still interested.
hz8w97l
programmingrequests
tnilk
t1_hz8w97l
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/hz8w97l/
3/3/2022 11:15:22 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hxie8f9
t1_hxie8f9
hxie8f9
0
sha43h
False
False
False
2
4
23
23
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
114
Commented
2/1/2022 5:22:57 AM
Doesn't seem very complicated to do. Can be done in pretty much any language really. Might give it a try. DM me so we can work out some of the details.
hv3i522
programmingrequests
Ascor8522
t1_hv3i522
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/hv3i522/
2/1/2022 5:22:57 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
sha43h
t3_sha43h
sha43h
1
sha43h
False
False
False
0
4
23
23
2
6.45161290322581
1
3.2258064516129
0
0
10
32.258064516129
31
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
113
RepliedTo
2/18/2022 11:15:44 PM
sent pm
hxie96e
programmingrequests
RavenTBK
t1_hxie96e
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/hxie96e/
2/18/2022 11:15:44 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hv3i522
t1_hv3i522
hv3i522
0
sha43h
True
False
False
1
4
23
23
0
0
0
0
0
0
2
100
2
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
112
Commented
2/1/2022 5:22:57 AM
Doesn't seem very complicated to do. Can be done in pretty much any language really. Might give it a try. DM me so we can work out some of the details.
hv3i522
programmingrequests
Ascor8522
t1_hv3i522
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/hv3i522/
2/1/2022 5:22:57 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
sha43h
t3_sha43h
sha43h
1
sha43h
False
False
False
0
4
23
23
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
111
RepliedTo
2/18/2022 11:15:44 PM
sent pm
hxie96e
programmingrequests
RavenTBK
t1_hxie96e
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/hxie96e/
2/18/2022 11:15:44 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hv3i522
t1_hv3i522
hv3i522
0
sha43h
True
False
False
1
4
23
23
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
124
Posted
1/31/2022 7:42:59 PM
Well, here's the beef: At work I regularly have to log in to a particular website, and manually pull odometer readings off of the individual vehicles on a semi-regular basis. The data should be saved in a CSV, showing unit number, odometer reading, and date/time read -- which I could easily link with my spreadsheets for tracking maintenance.
I previously built a script within Octoparse which did what I needed, but only after I manually logged in and set the cookies. Now it doesn't work at all, and it is highly annoying to be stuck doing it manually again for ~50ish trucks.
There is an acquaintance who has written scripts in Python which can do what I need to do (I subscribe to his services) for another end. He's afraid to share/copy/license/sell these scripts to me for fear of me starting my own business and becoming his competitor. So that avenue is pretty much off the table.
I used to be pretty nifty with Perl back in the day, but now everything is dynamic HTML and CSS nonsense, and it has all gotten too deep for me. I don't believe I would have any issues maintaining the script as time goes on... just the creation bit is where I'm completely flabbergasted.
I've attached a video giving the details of what I have a need to accomplish.
edit: I see the video was spontaneously expunged from the post. I threw it up on YT:
[https://youtu.be/BcW6vYNdR0c](https://youtu.be/BcW6vYNdR0c)
sha43h
programmingrequests
RavenTBK
t3_sha43h
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/
1/31/2022 7:42:59 PM
1/31/2022 7:51:57 PM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Have a need to regularly scrape fleet odometer readings from website and save in CSV
False
1
sha43h
0
4
23
23
8
3.18725099601594
6
2.39043824701195
0
0
101
40.2390438247012
251
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
123
Posted
1/31/2022 7:42:59 PM
Well, here's the beef: At work I regularly have to log in to a particular website, and manually pull odometer readings off of the individual vehicles on a semi-regular basis. The data should be saved in a CSV, showing unit number, odometer reading, and date/time read -- which I could easily link with my spreadsheets for tracking maintenance.
I previously built a script within Octoparse which did what I needed, but only after I manually logged in and set the cookies. Now it doesn't work at all, and it is highly annoying to be stuck doing it manually again for ~50ish trucks.
There is an acquaintance who has written scripts in Python which can do what I need to do (I subscribe to his services) for another end. He's afraid to share/copy/license/sell these scripts to me for fear of me starting my own business and becoming his competitor. So that avenue is pretty much off the table.
I used to be pretty nifty with Perl back in the day, but now everything is dynamic HTML and CSS nonsense, and it has all gotten too deep for me. I don't believe I would have any issues maintaining the script as time goes on... just the creation bit is where I'm completely flabbergasted.
I've attached a video giving the details of what I have a need to accomplish.
edit: I see the video was spontaneously expunged from the post. I threw it up on YT:
[https://youtu.be/BcW6vYNdR0c](https://youtu.be/BcW6vYNdR0c)
sha43h
programmingrequests
RavenTBK
t3_sha43h
https://www.reddit.com/r/programmingrequests/comments/sha43h/have_a_need_to_regularly_scrape_fleet_odometer/
1/31/2022 7:42:59 PM
1/31/2022 7:51:57 PM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Have a need to regularly scrape fleet odometer readings from website and save in CSV
False
1
sha43h
0
4
23
23
128, 128, 128
3
Solid
50
No
110
Commented
2/11/2022 8:17:29 AM
Try something like this:
^(.+)\1
Replace: $1 or \1
It matches lines whose text is duplicated and replaces them with only the first instance of the text.
[Regex101 demo](https://regex101.com/r/ROVnrB/1)
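The back-reference pattern above can be applied per line in Python with `re.sub` in multiline mode; this is a minimal sketch of the anchored form of the regex:

```python
import re

def dedupe_lines(text):
    """Collapse lines of the form XX down to X.

    ^(.+)\\1$ captures a prefix whose exact repetition spans the whole
    line, then keeps only the first copy; lines with no such repetition
    are left untouched.
    """
    return re.sub(r"^(.+)\1$", r"\1", text, flags=re.MULTILINE)
```

The `^`/`$` anchors make the match strict (the line must be exactly two copies of the same text), which avoids mangling lines that merely contain a repeated substring.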
hwhca75
regex
whereIsMyBroom
t1_hwhca75
https://www.reddit.com/r/regex/comments/spu36c/remove_duplicate_text/hwhca75/
2/11/2022 8:17:29 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
spu36c
t3_spu36c
spu36c
0
spu36c
False
False
False
0
1
2
2
0
0
0
0
0
0
16
44.4444444444444
36
128, 128, 128
3
Solid
50
Yes
109
Commented
2/11/2022 8:07:46 AM
Well, it's very relevant to include which app you are using to do this replacement.
If every line is repeated just twice, can't you just take half of the string? Why regex?
Either way, if one uses regex replace functions, the general concept would be to match up until you find exactly the same text again, lock that expression in anchors, and replace the whole string with the first capture group:
^(.+)\1$
See an online [demo](https://regex101.com/r/LPGa5W/1)
hwhbjk5
regex
BarneField
t1_hwhbjk5
https://www.reddit.com/r/regex/comments/spu36c/remove_duplicate_text/hwhbjk5/
2/11/2022 8:07:46 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
spu36c
t3_spu36c
spu36c
1
spu36c
False
False
False
0
1
2
2
1
1.21951219512195
0
0
0
0
41
50
82
128, 128, 128
3
Solid
50
Yes
108
RepliedTo
2/11/2022 9:29:59 AM
Sorry for the incomplete info. I am using Octoparse to scrape contact profile details from LinkedIn. Basically I'm trying to see if old prospects have changed their companies or if their titles have changed, so I need to go to each profile and check.
Now when I try to scrape, it's really hard to find XPaths for each element, so I am scraping the complete Experience section and it gives doubled text. I have 2 options here:
1. use a spreadsheet and remove the duplicate text with a formula
2. use a regex tool to remove the duplicate text
Here is the text which is extracted from profiles
https://docs.google.com/spreadsheets/d/1iflvQaYTHyxZYjAaIUQE4as3qhPiyt0bjszPEdHyNTk/edit#gid=1416613586
hwhhnnb
regex
kartikoli
t1_hwhhnnb
https://www.reddit.com/r/regex/comments/spu36c/remove_duplicate_text/hwhhnnb/
2/11/2022 9:29:59 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hwhbjk5
t1_hwhbjk5
hwhbjk5
0
spu36c
True
False
False
1
1
2
2
0
0
3
2.9126213592233
0
0
53
51.4563106796116
103
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
738
Posted
11/1/2020 4:04:55 PM
I am new to XPath and don't have any programming background, so please treat me as a newbie. I am trying to scrape revenue details from [Rocketreach.co](https://Rocketreach.co) using Octoparse, and it does well, but the table format sometimes changes, so the Revenue can be in the second row for some companies and in the 3rd or 4th row for others. In order to grab the correct value I am trying to write an XPath using the text 'Revenue':
`//strong[text()='Revenue']`
but I can't make it work to grab the revenue details. Similarly, I would like to write more XPaths for Phone, Funding, etc. Here is an example URL for a company profile: [https://rocketreach.co/qualtrics-profile\_b5c2d23cf42e0f37](https://rocketreach.co/qualtrics-profile_b5c2d23cf42e0f37)
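The label-anchored idea can be sketched with Python's standard library (the toy markup, row layout, and tag names are assumptions for illustration, not Rocketreach's actual HTML):

```python
import xml.etree.ElementTree as ET

# Toy markup standing in for the profile page: one label cell and one value
# cell per row (tags and values are made up).
doc = ET.fromstring("""
<table>
  <row><td><strong>Employees</strong></td><td>500</td></row>
  <row><td><strong>Revenue</strong></td><td>$1B</td></row>
</table>
""")

# Anchor on the label text instead of the row position, so the lookup still
# works when Revenue moves to a different row.
value = None
for row in doc.findall(".//row"):
    label = row.find("td/strong")
    if label is not None and label.text == "Revenue":
        value = row.findall("td")[1].text
print(value)  # $1B
```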
jm4oc1
learnprogramming
kartikoli
t3_jm4oc1
https://www.reddit.com/r/learnprogramming/comments/jm4oc1/help_with_xpath/
11/1/2020 4:04:55 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Help with Xpath
False
1
jm4oc1
0
29
2
2
4
3.2520325203252
0
0
0
0
63
51.219512195122
123
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
737
Posted
5/18/2021 3:02:09 AM
I need help with Octoparse to scrape data from Google Maps search links; it's easy and practically unlimited to use. Since I don't have a programming background, it suits my needs perfectly. They have a tutorial for Google Maps, but due to recent changes in Google it's not working now: [`https://helpcenter.octoparse.com/hc...-Scrape-Business-Information-from-Google-Maps`](https://helpcenter.octoparse.com/hc/en-us/articles/900002292706-Scrape-Business-Information-from-Google-Maps)
I followed the tutorial previously and it was working fine; then Google made a change that requires the sidebar to be scrolled down to see more results (currently only 10 results if we don't scroll down: [`https://www.google.com/maps/search/...+TX/@29.7160081,-95.4987635,10z/data=!3m1!4b1`](https://www.google.com/maps/search/insurance+West+University+Place,+TX/@29.7160081,-95.4987635,10z/data=!3m1!4b1)), but Octoparse can't do that at the moment, so I have to follow another process to scrape details: click on the first result, then click on each result one by one. For that to happen I need a relative XPath for those results. Can anyone help with that?
Or is there an alternative that can scrape data from Google Maps without coding?
nezi7k
learnprogramming
kartikoli
t3_nezi7k
https://www.reddit.com/r/learnprogramming/comments/nezi7k/help_with_recipe_for_google_maps_using_octoparse/
5/18/2021 3:02:09 AM
1/1/0001 12:00:00 AM
False
False
3
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Help with recipe for Google Maps using Octoparse
False
0.68
nezi7k
0
29
2
2
4
1.96078431372549
0
0
0
0
118
57.843137254902
204
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
736
Posted
5/18/2021 3:02:09 AM
I need help with Octoparse to scrape data from Google Maps search links; it's easy and practically unlimited to use. Since I don't have a programming background, it suits my needs perfectly. They have a tutorial for Google Maps, but due to recent changes in Google it's not working now: [`https://helpcenter.octoparse.com/hc...-Scrape-Business-Information-from-Google-Maps`](https://helpcenter.octoparse.com/hc/en-us/articles/900002292706-Scrape-Business-Information-from-Google-Maps)
I followed the tutorial previously and it was working fine; then Google made a change that requires the sidebar to be scrolled down to see more results (currently only 10 results if we don't scroll down: [`https://www.google.com/maps/search/...+TX/@29.7160081,-95.4987635,10z/data=!3m1!4b1`](https://www.google.com/maps/search/insurance+West+University+Place,+TX/@29.7160081,-95.4987635,10z/data=!3m1!4b1)), but Octoparse can't do that at the moment, so I have to follow another process to scrape details: click on the first result, then click on each result one by one. For that to happen I need a relative XPath for those results. Can anyone help with that?
Or is there an alternative that can scrape data from Google Maps without coding?
nezi7k
learnprogramming
kartikoli
t3_nezi7k
https://www.reddit.com/r/learnprogramming/comments/nezi7k/help_with_recipe_for_google_maps_using_octoparse/
5/18/2021 3:02:09 AM
1/1/0001 12:00:00 AM
False
False
2
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Help with recipe for Google Maps using Octoparse
False
0.61
nezi7k
0
29
2
2
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
735
Posted
2/11/2022 8:01:29 AM
I am trying to scrape data from LinkedIn and have succeeded up to an extent, but the scraped data has double text in each line:
**customer servicescustomer servicesHouse of FraserHouse of FraserManaging DirectorManaging DirectorAdebanbutayo Trading StoresAdebanbutayo Trading StoresJan 2012 - Present · 10 yrs 2 mosJan 2012 - Present · 10 yrs 2 mos**
We can see that the same exact data is repeated, so I want to add a regular expression that removes the exact duplicate data; the text should look like:
**customer servicesHouse of FraserManaging DirectorAdebanbutayo Trading StoresJan 2012 - Present · 10 yrs 2 mos**
I am completely new when it comes to regex and have tried a few online tutorials, but somehow I can't make it work. Any help would be appreciated.
P.S. I am using Octoparse to scrape contact profile details from LinkedIn, but it's really hard to find XPaths for each element, so I am scraping the complete Experience section and it gives double text. Now I have 2 options here:
1. use a spreadsheet and remove the duplicate text with a formula
2. use a regex tool to remove the duplicate text
Here is the text extracted from the profiles:
[https://docs.google.com/spreadsheets/d/1iflvQaYTHyxZYjAaIUQE4as3qhPiyt0bjszPEdHyNTk/edit#gid=1416613586](https://docs.google.com/spreadsheets/d/1iflvQaYTHyxZYjAaIUQE4as3qhPiyt0bjszPEdHyNTk/edit#gid=1416613586)
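Besides the two options above, the doubled lines can also be collapsed without regex or a spreadsheet: keep only the first half of a line when the line is exactly that half repeated (a minimal sketch; the sample strings are made up):

```python
def dedupe(line: str) -> str:
    # If the line is exactly its first half repeated back-to-back,
    # keep only the first half; otherwise leave it unchanged.
    half = len(line) // 2
    if len(line) % 2 == 0 and line[:half] == line[half:]:
        return line[:half]
    return line

print(dedupe("House of FraserHouse of Fraser"))  # House of Fraser
print(dedupe("not duplicated"))                  # not duplicated
```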
spu36c
regex
kartikoli
t3_spu36c
https://www.reddit.com/r/regex/comments/spu36c/remove_duplicate_text/
2/11/2022 8:01:29 AM
2/11/2022 9:32:42 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Remove duplicate text
False
1
spu36c
0
29
2
2
3
1.43540669856459
1
0.478468899521531
0
0
118
56.4593301435407
209
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
734
Commented
2/12/2022 4:15:33 AM
I've managed to clean the data and remove unwanted spaces in Octoparse using regex, and now the data looks much cleaner. When I try to remove the duplicate data, the regex expression works, but only for the first line, not for all of them: https://i.imgur.com/WjEGY7I.png
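A common cause of a pattern matching only the first line is that `^` and `$` are anchored to the whole input rather than to each line. In Python, for instance, `re.MULTILINE` makes the anchors match at every line break (a hedged sketch; whether Octoparse exposes an equivalent flag is not confirmed here):

```python
import re

# Two doubled lines; the sample text is made up.
text = "House of FraserHouse of Fraser\nManaging DirectorManaging Director"

# re.MULTILINE makes ^ and $ match at each line boundary, so the
# duplicate-collapsing pattern is applied to every line, not just one.
cleaned = re.sub(r"^(.+)\1$", r"\1", text, flags=re.MULTILINE)
print(cleaned)
```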
hwlpggb
regex
kartikoli
t1_hwlpggb
https://www.reddit.com/r/regex/comments/spu36c/remove_duplicate_text/hwlpggb/
2/12/2022 4:15:33 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
spu36c
t3_spu36c
spu36c
0
spu36c
True
False
False
0
29
2
2
3
7.14285714285714
1
2.38095238095238
0
0
18
42.8571428571429
42
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
733
Posted
2/27/2021 6:27:39 AM
I am trying to grab addresses and phone numbers for companies from Bing Maps with Octoparse, but I can't grab the phone numbers with the default XPath for all the search results. I've customized the XPath for the address using the sibling method and it works great:
//*[@id='saplacesvg']/following-sibling::div
but I can't do the same for the phone numbers. Here is a test URL that I am trying to work with: [https://www.bing.com/maps?q=fortvale.com+UK+phone+number](https://www.bing.com/maps?q=fortvale.com+UK+phone+number)
I can see that svg id="sacallsvg" looks unique and could be used with the sibling method, but somehow I can't make it work.
Any help will be appreciated.
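By analogy with the address XPath, something like `//*[@id='sacallsvg']/following-sibling::div` is presumably the target. The sibling-step idea can be sketched with Python's standard library (the toy fragment and its layout are assumptions about Bing's markup, not its real HTML):

```python
import xml.etree.ElementTree as ET

# Toy fragment mirroring the described structure: an svg icon with a known
# id, followed by a sibling div holding the value (ids/layout are made up).
doc = ET.fromstring("""
<section>
  <svg id="sacallsvg" />
  <div>+44 1234 567890</div>
</section>
""")

# ElementTree has no following-sibling axis, so walk each parent and take
# the element right after the matching svg -- the same idea as the XPath
# //*[@id='sacallsvg']/following-sibling::div that the post is aiming for.
phone = None
for parent in doc.iter():
    children = list(parent)
    for i, child in enumerate(children[:-1]):
        if child.get("id") == "sacallsvg":
            phone = children[i + 1].text
print(phone)  # +44 1234 567890
```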
lti39i
learnprogramming
kartikoli
t3_lti39i
https://www.reddit.com/r/learnprogramming/comments/lti39i/help_with_xpath_for_octoparse/
2/27/2021 6:27:39 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Help with Xpath for Octoparse
False
1
lti39i
0
29
2
2
5
4.23728813559322
0
0
0
0
58
49.1525423728814
118
128, 128, 128
3.02649006622517
Dash Dot Dot
49.8864711447493
No
732
Commented
3/3/2021 6:28:20 AM
Guys, any help please?
gpiazdu
learnprogramming
kartikoli
t1_gpiazdu
https://www.reddit.com/r/learnprogramming/comments/lti39i/help_with_xpath_for_octoparse/gpiazdu/
3/3/2021 6:28:20 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lti39i
t3_lti39i
lti39i
0
lti39i
True
False
False
0
29
2
2
0
0
0
0
0
0
3
75
4
128, 128, 128
3
Solid
50
Yes
106
Commented
3/11/2023 2:15:17 PM
I came to a similar idea when I was scraping flight tickets:
[https://github.com/PxyUp/fitter](https://github.com/PxyUp/fitter)
Probably we can work together somehow to build something really cool.
jbt15e2
webscraping
PyxRu
t1_jbt15e2
https://www.reddit.com/r/webscraping/comments/11mru7n/no_code_command_line_webscraper/jbt15e2/
3/11/2023 2:15:17 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
11mru7n
t3_11mru7n
11mru7n
1
11mru7n
False
False
False
0
1
70
70
2
5.88235294117647
0
0
0
0
17
50
34
128, 128, 128
3
Solid
50
Yes
105
RepliedTo
3/12/2023 11:57:18 AM
Nice, I’ll check it out!
jbx4lzy
webscraping
dhondtdoit
t1_jbx4lzy
https://www.reddit.com/r/webscraping/comments/11mru7n/no_code_command_line_webscraper/jbx4lzy/
3/12/2023 11:57:18 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jbt15e2
t1_jbt15e2
jbt15e2
0
11mru7n
True
False
False
1
1
70
70
1
16.6666666666667
0
0
0
0
2
33.3333333333333
6
128, 128, 128
3
Solid
50
No
107
Posted
3/9/2023 1:12:14 PM
I am currently building a webscraper, called [goskyr](https://github.com/jakopako/goskyr), that can be run from the command line and is supposed to be easily configurable. So instead of having to write code to scrape a website, you'd just write a configuration snippet and run the scraper. I realize that there are a number of GUI-based scraping services that make it extremely easy to set up a scraping process for any website, so for people having no coding experience whatsoever those would probably be the easiest solution. I'm trying to come close to those GUI-based solutions in terms of functionality by providing a 'smart' way of finding potentially interesting data/fields and letting the user select a subset in a terminal-based UI. Also, date extraction & parsing and the newly added machine learning capability are probably worth mentioning. Still, those other, GUI-based solutions are really awesome, e.g. Octoparse or ScrapeStorm.
I actually started this scraping project because of an idea I wanted to try: scraping concert data from as many websites as possible with as little effort as possible, see [https://github.com/jakopako/croncert-config](https://github.com/jakopako/croncert-config). This seems to work better and better. Still, I am wondering if there are any other valid use cases for such a terminal-based scraper, or if it's rather niche. What do you think?
11mru7n
webscraping
dhondtdoit
t3_11mru7n
https://www.reddit.com/r/webscraping/comments/11mru7n/no_code_command_line_webscraper/
3/9/2023 1:12:14 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
No code command line webscraper
False
1
11mru7n
0
1
70
70
9
3.86266094420601
0
0
0
0
122
52.3605150214592
233
128, 128, 128
3
Solid
50
No
103
Commented
6/4/2020 10:23:25 PM
Using scrapingdog.com, you can just focus on data collection rather than on the backend. Try it. It also provides a generous free pack with 1000 API calls.
fswrjn2
webscraping
yakult2450
t1_fswrjn2
https://www.reddit.com/r/webscraping/comments/gwn1sx/scraping_truliacom_need_tools_tips_tutorials_for/fswrjn2/
6/4/2020 10:23:25 PM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gwn1sx
t3_gwn1sx
gwn1sx
1
gwn1sx
False
False
False
0
1
35
35
2
7.14285714285714
0
0
0
0
12
42.8571428571429
28
128, 128, 128
3
Solid
50
No
102
RepliedTo
6/4/2020 10:23:31 PM
**I found links in your comment that were not hyperlinked:**
* [scrapingdog.com](https://scrapingdog.com)
*I did the honors for you.*
***
^[delete](https://www.reddit.com/message/compose?to=%2Fu%2FLinkifyBot&subject=delete%20fswrjn2&message=Click%20the%20send%20button%20to%20delete%20the%20false%20positive.) ^| ^[information](https://np.reddit.com/u/LinkifyBot/comments/gkkf7p) ^| ^<3
fswrjzu
webscraping
LinkifyBot
t1_fswrjzu
https://www.reddit.com/r/webscraping/comments/gwn1sx/scraping_truliacom_need_tools_tips_tutorials_for/fswrjzu/
6/4/2020 10:23:31 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fswrjn2
t1_fswrjn2
fswrjn2
0
gwn1sx
False
False
False
1
1
35
35
0
0
0
0
0
0
32
59.2592592592593
54
128, 128, 128
3
Solid
50
No
104
Posted
6/4/2020 6:25:00 PM
Hi All,
I'm trying to shop for a new apartment, and would like to be able to filter and sort on more values than the Trulia interface provides. Trulia property detail pages have some elements that are optional, and others that have inconsistent formatting. For example, a property may or may not have Neighborhood or Contact Phone elements, the Address may be one line or two, and the Description section may have multiple paragraphs. There is also a property Details list that I'd like to be able to pull in without having to repeat every element of the parent property.
I've tried some GUI tools like ScrapeStorm and Octoparse, and they're okay for the basic task of opening a search page and then looping through detail pages, but they don't offer an obvious solution to the problem of capturing elements that don't appear on the first detail page. I thought there might be some way to navigate through multiple detail pages to identify more elements, but if there is, I'm not seeing it.
I understand that XPath is probably the way to go here, and I'll probably want to learn that eventually, but for now this is a one-off task, so if I can avoid having to research and install a suitable scripting language, and research and install a library that supports XPath, and learn XPath, and slog through all the other intermediary yak-shaving, I'd prefer that. Using XPath to better define elements selected via GUI would be an acceptable compromise.
The current goal is to end up with an Excel data model that contains Listings and ListingDetails tables that I can combine into a pivot table for easy slicing and dicing. (If there's a better way, I'm open to suggestion.) In a perfect world, I'd do that with a free program or site that requires minimal configuration, or even has a Trulia-specific template, like I've seen for Zillow and Realtor.com. Trialware is okay, as long as it's fully functional and has no download/export limits (looking at you, ScrapeStorm).
I'm using a Windows 10 PC. I've got Python (Anaconda) and the latest Node.js installed, so if scripting is the only way, I'd go with one of those, but I have only the barest familiarity with them, so again, a GUI would be nice.
Thanks!
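The optional-element problem described above can be sketched generically: extract each field with a default, so rows stay aligned even when a detail page lacks an element (the toy markup and tag names are illustrative assumptions, not Trulia's HTML):

```python
import xml.etree.ElementTree as ET

# Toy detail pages mirroring the problem: the second page lacks the
# optional phone element (tags and values are made up).
pages = [
    "<listing><address>1 Main St</address><phone>555-0100</phone></listing>",
    "<listing><address>2 Oak Ave</address></listing>",
]

rows = []
for page in pages:
    doc = ET.fromstring(page)
    # findtext falls back to the default when an element is missing, so
    # every row has the same columns regardless of which elements exist.
    rows.append({
        "address": doc.findtext("address", default=""),
        "phone": doc.findtext("phone", default=""),
    })
print(rows)
```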
gwn1sx
webscraping
doshka
t3_gwn1sx
https://www.reddit.com/r/webscraping/comments/gwn1sx/scraping_truliacom_need_tools_tips_tutorials_for/
6/4/2020 6:25:00 PM
6/4/2020 6:28:24 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scraping Trulia.com: Need tools, tips, tutorials for scraping detail pages containing inconsistent elements
False
1
gwn1sx
0
1
35
35
10
2.56410256410256
4
1.02564102564103
0
0
162
41.5384615384615
390
128, 128, 128
3.0085146641438
Dash Dot Dot
49.9635085822408
Yes
101
Commented
6/4/2020 8:47:24 PM
I scrape Craigslist similarly to how you laid this out. I use Puppeteer and select with document.querySelectorAll. I could port it to scrape Trulia, but I would need some money up front to make sure you are for real. Happy to demo my current scraper.
fswfmg6
webscraping
joelcorey
t1_fswfmg6
https://www.reddit.com/r/webscraping/comments/gwn1sx/scraping_truliacom_need_tools_tips_tutorials_for/fswfmg6/
6/4/2020 8:47:24 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gwn1sx
t3_gwn1sx
gwn1sx
1
gwn1sx
False
False
False
0
10
35
35
1
2.17391304347826
0
0
0
0
20
43.4782608695652
46
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
Yes
100
RepliedTo
6/4/2020 9:21:58 PM
Thanks for the offer, but I don't have a budget for this, and wouldn't ask you to work for free. I'm actually keen to do it myself, just looking for the most direct route. Given the criteria in my post, do you think Puppeteer would meet my needs?
fswk1t8
webscraping
doshka
t1_fswk1t8
https://www.reddit.com/r/webscraping/comments/gwn1sx/scraping_truliacom_need_tools_tips_tutorials_for/fswk1t8/
6/4/2020 9:21:58 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fswfmg6
t1_fswfmg6
fswfmg6
1
gwn1sx
True
False
False
1
9
35
35
3
6.25
0
0
0
0
15
31.25
48
128, 128, 128
3.0085146641438
Dash Dot Dot
49.9635085822408
Yes
99
RepliedTo
6/4/2020 9:39:15 PM
Yes, Puppeteer would work well for you. Happy to demo my project and walk you through my thought process and how I built it. It might get you a ways down the road with your project.
fswm6ul
webscraping
joelcorey
t1_fswm6ul
https://www.reddit.com/r/webscraping/comments/gwn1sx/scraping_truliacom_need_tools_tips_tutorials_for/fswm6ul/
6/4/2020 9:39:15 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fswk1t8
t1_fswk1t8
fswk1t8
1
gwn1sx
False
False
False
2
10
35
35
3
8.33333333333333
0
0
0
0
12
33.3333333333333
36
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
Yes
98
RepliedTo
6/4/2020 10:43:28 PM
I'd appreciate a demo, yes. What medium would we use, when are you free, and can you ballpark about how long that would take? You can DM me if the comment section isn't the best place to arrange things. Thanks!
fswtx1f
webscraping
doshka
t1_fswtx1f
https://www.reddit.com/r/webscraping/comments/gwn1sx/scraping_truliacom_need_tools_tips_tutorials_for/fswtx1f/
6/4/2020 10:43:28 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fswm6ul
t1_fswm6ul
fswm6ul
1
gwn1sx
True
False
False
3
9
35
35
3
7.5
0
0
0
0
14
35
40
128, 128, 128
3.0085146641438
Dash Dot Dot
49.9635085822408
Yes
97
RepliedTo
6/4/2020 11:30:40 PM
I don't mind if other people see this, but I do appreciate the suggestion. So unless it's somehow rude to have a public conversation here, I see no issue. Let's do 10AM West Coast / U.S. time tomorrow. I am generally available for whatever time, though. :)
fswzbmc
webscraping
joelcorey
t1_fswzbmc
https://www.reddit.com/r/webscraping/comments/gwn1sx/scraping_truliacom_need_tools_tips_tutorials_for/fswzbmc/
6/4/2020 11:30:40 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fswtx1f
t1_fswtx1f
fswtx1f
1
gwn1sx
False
False
False
4
10
35
35
2
4.34782608695652
2
4.34782608695652
0
0
18
39.1304347826087
46
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
Yes
96
RepliedTo
6/4/2020 11:36:54 PM
That should be okay, barring anything springing up with work. Do you have a preference for meeting type? If not, I'll default to Google Meet. About how long, do you think?
fsx00xl
webscraping
doshka
t1_fsx00xl
https://www.reddit.com/r/webscraping/comments/gwn1sx/scraping_truliacom_need_tools_tips_tutorials_for/fsx00xl/
6/4/2020 11:36:54 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fswzbmc
t1_fswzbmc
fswzbmc
1
gwn1sx
True
False
False
5
9
35
35
1
3.2258064516129
0
0
0
0
12
38.7096774193548
31
128, 128, 128
3.0085146641438
Dash Dot Dot
49.9635085822408
Yes
95
RepliedTo
6/4/2020 11:52:47 PM
It would only take like 5-10 minutes. We can do Zoom or Google, that is fine. I also have Skype.
fsx1sfi
webscraping
joelcorey
t1_fsx1sfi
https://www.reddit.com/r/webscraping/comments/gwn1sx/scraping_truliacom_need_tools_tips_tutorials_for/fsx1sfi/
6/4/2020 11:52:47 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fsx00xl
t1_fsx00xl
fsx00xl
0
gwn1sx
False
False
False
6
10
35
35
1
4.76190476190476
0
0
0
0
6
28.5714285714286
21
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
94
RepliedTo
4/11/2023 5:07:24 PM
The website is [https://rollbit.com/nft/sportsbot/marketplace](https://rollbit.com/nft/sportsbot/marketplace); I would like to capture the information on each 'card' shown in the list. The data that is intermittently not captured is the 'listed price' (the most important info, lol). All other information I can capture.
jfulzhc
webscraping
VenetianArsenale
t1_jfulzhc
https://www.reddit.com/r/webscraping/comments/12iq1sf/im_so_close_can_anyone_assist_why_would_some/jfulzhc/
4/11/2023 5:07:24 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jfuld8d
t1_jfuld8d
jfuld8d
0
12iq1sf
False
False
False
1
4
71
71
1
2.04081632653061
0
0
0
0
25
51.0204081632653
49
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
93
RepliedTo
4/11/2023 5:05:01 PM
Hey, thank you for your response. This is a no-code web scraper I'm using, called Octoparse. It's the only tool I'm using, nothing else.
jfulmaq
webscraping
VenetianArsenale
t1_jfulmaq
https://www.reddit.com/r/webscraping/comments/12iq1sf/im_so_close_can_anyone_assist_why_would_some/jfulmaq/
4/11/2023 5:05:01 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
jfuld8d
t1_jfuld8d
jfuld8d
0
12iq1sf
False
False
False
1
4
71
71
1
4.16666666666667
0
0
0
0
10
41.6666666666667
24
128, 128, 128
3
Solid
50
No
92
Posted
4/17/2023 2:16:53 AM
[removed]
12oy6yw
webscraping
Manonmo
t3_12oy6yw
https://www.reddit.com/r/webscraping/comments/12oy6yw/someone_can_borrow_me_an_account_premium_octoparse/
4/17/2023 2:16:53 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
can someone lend me a premium Octoparse account?
False
0.25
12oy6yw
0
1
1
1
0
0
0
0
0
0
1
100
1
128, 128, 128
3
Solid
50
No
90
Commented
11/16/2022 6:32:47 PM
What does it even do?
iwmcr8o
webscraping
05_legend
t1_iwmcr8o
https://www.reddit.com/r/webscraping/comments/yl2xt5/octoparse_is_a_scam/iwmcr8o/
11/16/2022 6:32:47 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
yl2xt5
t3_yl2xt5
yl2xt5
0
yl2xt5
False
False
False
0
1
4
4
0
0
0
0
0
0
0
0
5
128, 128, 128
3
Solid
50
Yes
88
RepliedTo
2/21/2023 7:13:14 AM
Man, please enlighten me. I was thinking the same but don't have a clue where to start.
j9e7c0e
webscraping
AbdKhaled
t1_j9e7c0e
https://www.reddit.com/r/webscraping/comments/yl2xt5/octoparse_is_a_scam/j9e7c0e/
2/21/2023 7:13:14 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iyp42h4
t1_iyp42h4
iyp42h4
1
yl2xt5
False
False
False
1
1
4
4
1
5.55555555555556
0
0
0
0
6
33.3333333333333
18
128, 128, 128
3
Solid
50
Yes
87
RepliedTo
2/26/2023 4:07:04 AM
Press Ctrl+S on any webpage to download all the HTML of that page, then write a script to parse through the HTML code to get the data you want. Then go back and create a script that automatically presses Ctrl+S on various webpages to download the HTML for a lot of different pages.
When I realized you could just download all the HTML for a webpage, my world opened up.
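The parse step can be sketched with Python's standard library alone, e.g. collecting every link from a saved page (the inline HTML string stands in for a file saved with Ctrl+S):

```python
from html.parser import HTMLParser

# Minimal sketch: pull every href out of a saved page's anchor tags.
class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

# Made-up sample standing in for the contents of a saved .html file.
saved_html = '<html><body><a href="/listing/1">One</a><a href="/listing/2">Two</a></body></html>'
parser = LinkCollector()
parser.feed(saved_html)
print(parser.links)  # ['/listing/1', '/listing/2']
```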
ja1mshy
webscraping
AtwoodEnterprise
t1_ja1mshy
https://www.reddit.com/r/webscraping/comments/yl2xt5/octoparse_is_a_scam/ja1mshy/
2/26/2023 4:07:04 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
j9e7c0e
t1_j9e7c0e
j9e7c0e
0
yl2xt5
False
False
False
2
1
4
4
0
0
0
0
0
0
32
43.2432432432432
74
128, 128, 128
3
Solid
50
No
89
Commented
12/3/2022 1:36:17 AM
I did pretty well with it when I used it, but it just wasn’t as efficient as I liked, so I went and taught myself to make my own web scraper, and mine is much better lol
iyp42h4
webscraping
AtwoodEnterprise
t1_iyp42h4
https://www.reddit.com/r/webscraping/comments/yl2xt5/octoparse_is_a_scam/iyp42h4/
12/3/2022 1:36:17 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
yl2xt5
t3_yl2xt5
yl2xt5
1
yl2xt5
False
False
False
0
1
4
4
5
13.5135135135135
0
0
0
0
10
27.027027027027
37
128, 128, 128
3
Solid
50
Yes
85
RepliedTo
11/5/2022 6:42:20 PM
Why 50% on him? It's his fault 100%. Refund works during trial. Trial over - no refund. Right?
iv6o0lw
webscraping
RobSm
t1_iv6o0lw
https://www.reddit.com/r/webscraping/comments/yl2xt5/octoparse_is_a_scam/iv6o0lw/
11/5/2022 6:42:20 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iv561zo
t1_iv561zo
iv561zo
1
yl2xt5
False
False
False
1
1
4
4
4
23.5294117647059
1
5.88235294117647
0
0
6
35.2941176470588
17
128, 128, 128
3
Solid
50
Yes
84
RepliedTo
11/5/2022 10:57:31 PM
I was being generous, but fair enough, you’re right
iv7nqgi
webscraping
Nabinator
t1_iv7nqgi
https://www.reddit.com/r/webscraping/comments/yl2xt5/octoparse_is_a_scam/iv7nqgi/
11/5/2022 10:57:31 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iv6o0lw
t1_iv6o0lw
iv6o0lw
0
yl2xt5
False
False
False
2
1
4
4
4
40
0
0
0
0
2
20
10
128, 128, 128
3
Solid
50
No
86
Commented
11/5/2022 11:48:50 AM
You’re the nonce who forgot to cancel before the free trial ended. Don’t skip over that part, this is 50% on you.
iv561zo
webscraping
Nabinator
t1_iv561zo
https://www.reddit.com/r/webscraping/comments/yl2xt5/octoparse_is_a_scam/iv561zo/
11/5/2022 11:48:50 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
yl2xt5
t3_yl2xt5
yl2xt5
2
yl2xt5
False
False
False
0
1
4
4
1
4.16666666666667
0
0
0
0
12
50
24
128, 128, 128
3
Solid
50
No
83
RepliedTo
1/11/2023 12:46:25 PM
How does this shitty software having a scam-like bait-and-switch refund policy make him a nonce? They make it sound like you can easily get a refund if you are dissatisfied when signing up, so that you hopefully forget to cancel, then slam you with a fee higher than the standard tier of service when you forget. If you cancelled on purpose you'd be entitled to a 5-day refund period, but if they got you to forget, you don't? Bullshit. Do a chargeback via the bank; don't put up with this crap.
j3vvoj1
webscraping
OldNothing7241
t1_j3vvoj1
https://www.reddit.com/r/webscraping/comments/yl2xt5/octoparse_is_a_scam/j3vvoj1/
1/11/2023 12:46:25 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
iv561zo
t1_iv561zo
iv561zo
0
yl2xt5
False
False
False
1
1
4
4
3
3.15789473684211
5
5.26315789473684
0
0
30
31.5789473684211
95
128, 128, 128
3
Solid
50
No
82
Commented
11/4/2022 9:09:33 AM
You might be able to find the old policy version in the Web Archive.
iv06v98
webscraping
calson3asab
t1_iv06v98
https://www.reddit.com/r/webscraping/comments/yl2xt5/octoparse_is_a_scam/iv06v98/
11/4/2022 9:09:33 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
yl2xt5
t3_yl2xt5
yl2xt5
0
yl2xt5
False
False
False
0
1
4
4
0
0
0
0
0
0
6
42.8571428571429
14
128, 128, 128
3
Solid
50
No
91
Posted
11/3/2022 1:30:57 PM
When I went for their free trial, they said in their refund policy they would give 100% refund as long as the user is dissatisfied with the product. And boy, was I bloody dissatisfied with it. Didn’t even use anything they scraped coz it was shit. Then in my midst of finding alternative solutions, I forgot to cancel the free trial and when I wrote in to request the refund because I didn’t use anything they scraped, they changed their bloody refund policy so in the end I couldn’t find the old refund policy that says they would refund as long as the user is dissatisfied. I’m pissed as fuck. Hope none of y’all had to go through this.
yl2xt5
webscraping
canigetahoyahhh
t3_yl2xt5
https://www.reddit.com/r/webscraping/comments/yl2xt5/octoparse_is_a_scam/
11/3/2022 1:30:57 PM
1/1/0001 12:00:00 AM
False
False
13
2
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse is a scam
False
0.85
yl2xt5
0
1
4
4
8
6.45161290322581
7
5.64516129032258
0
0
40
32.258064516129
124
128, 128, 128
3
Solid
50
No
81
Commented
11/4/2022 3:18:13 AM
Works for me; I have scraped over 500k records using Octo.
iuzecr9
webscraping
turbospeedsc
t1_iuzecr9
https://www.reddit.com/r/webscraping/comments/yl2xt5/octoparse_is_a_scam/iuzecr9/
11/4/2022 3:18:13 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
yl2xt5
t3_yl2xt5
yl2xt5
0
yl2xt5
False
False
False
0
1
4
4
1
9.09090909090909
0
0
0
0
6
54.5454545454545
11
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
238
Posted
10/13/2021 11:28:52 PM
I work at a car dealership. There is a certain car auction that our dealership uses all the time. One of my duties is to go through every car on our watch-list and look up the MMR value of the car. Then I need to look up the car's AUTOCHECK report and see if it has a clean title and whether it had any accidents or damage. I have to go back and forth between the auction website and the MMR Calculator website and put in the car's VIN, miles, and condition report rating. Then I have to go to AUTOCHECK and check the vehicle's history. Then I have to put the MMR amount and anything bad I find in AUTOCHECK into the "notes" section of the auction.
Is there any way I can pull all the data from the auction, have it automatically put into the MMR Calculator and AUTOCHECK, and then automatically put the MMR value and AUTOCHECK history into the auction notes?
What I did so far: I figured the first step would be to pull all the data from the auction. I found a program called Octoparse. The only fields I really need from the auction are the VIN number, mileage, and condition report score. So I used Octoparse to pull all the cars on the watchlist. The problem I ran into is that the VIN number is not located on the watchlist. In order to find the VIN, I have to click a car on the watchlist, which takes me to the car's page, which has the actual VIN number. The notes (which is where the MMR and AUTOCHECK info needs to end up) are on that page as well. Once I can get the VIN number, mileage, and condition report into the same spreadsheet, I will need to find a way to have all the cars automatically go through the MMR Calculator and AUTOCHECK. Then the info needs to somehow be pasted into the notes for all the cars. I did some research and the only idea I could find is some sort of bookmarklet to do it.
q7ndfx
webdev
theofficialjewses
t3_q7ndfx
https://www.reddit.com/r/webdev/comments/q7ndfx/how_to_auto_populate_notes_for_cars_on_an_online/
10/13/2021 11:28:52 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to auto populate notes for cars on an online auction
False
1
q7ndfx
0
4
33
33
3
0.817438692098093
3
0.817438692098093
0
0
151
41.1444141689373
367
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
237
Posted
10/13/2021 11:30:10 PM
I work at a car dealership. There is a certain car auction that our dealership uses all the time. One of my duties is to go through every car on our watch-list and look up the MMR value of the car. Then I need to look up the car's AUTOCHECK report and see if it has a clean title and whether it had any accidents or damage. I have to go back and forth between the auction website and the MMR Calculator website and put in the car's VIN, miles, and condition report rating. Then I have to go to AUTOCHECK and check the vehicle's history. Then I have to go to the "notes" section of the auction and put in the MMR amount and anything bad I find in AUTOCHECK.
Is there any way I can pull all the data from the auction, have it automatically put into the MMR Calculator and AUTOCHECK, and then automatically put the MMR value and AUTOCHECK history into the auction notes?
What I did so far: I figured the first step would be to pull all the data from the auction. I found a program called Octoparse. The only fields I really need from the auction are the VIN, mileage, and condition report score. So I used Octoparse to pull all the cars on the watchlist. The problem I ran into is that the VIN is not located on the watchlist. To find the VIN, I have to click a car on the watchlist, which takes me to the car's page, which has the actual VIN. The notes (which is where I need the MMR and AUTOCHECK info to end up) are on that page as well. Once I can get the VIN, mileage, and condition report into the same spreadsheet, I will need to find a way to have all the cars automatically go through the MMR Calculator and AUTOCHECK. Then the info needs to somehow be pasted into the notes for all the cars. I did some research, and the only idea I could find is to use some sort of bookmarklet to do it.
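The first half of the workflow described above (walk the watchlist, open each car's page, pull the VIN) can be sketched in Python. This is a minimal illustration, not the auction site's real markup: the `class="car-link"` and `id="vin"` selectors are made-up placeholders you would replace after inspecting the actual pages.

```python
import re

def extract_detail_links(watchlist_html):
    """Pull each car's detail-page URL from the watchlist page.
    The class="car-link" attribute is a hypothetical placeholder."""
    return re.findall(r'<a class="car-link" href="([^"]+)"', watchlist_html)

def extract_vin(detail_html):
    """Pull the 17-character VIN from a car's detail page.
    VINs never contain I, O, or Q; id="vin" is a placeholder."""
    m = re.search(r'<span id="vin">([A-HJ-NPR-Z0-9]{17})</span>', detail_html)
    return m.group(1) if m else None
```

Collecting VIN, mileage, and condition score into one spreadsheet is then a loop: fetch the watchlist, call `extract_detail_links`, fetch each link, and run `extract_vin` on each page.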
q7ne98
AskProgramming
theofficialjewses
t3_q7ne98
https://www.reddit.com/r/AskProgramming/comments/q7ne98/how_can_i_auto_populate_notes_on_an_online_auction/
10/13/2021 11:30:10 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How can I auto populate notes on an online auction
False
1
q7ne98
0
4
33
33
3
0.817438692098093
3
0.817438692098093
0
0
151
41.1444141689373
367
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
80
Commented
10/14/2021 1:57:33 AM
Do any of the sites in question have a public API? That will make this much, much, much simpler.
Otherwise you get to do a lot of webscraping, which is tedious and error-prone.
hgk84zn
AskProgramming
KingofGamesYami
t1_hgk84zn
https://www.reddit.com/r/AskProgramming/comments/q7ne98/how_can_i_auto_populate_notes_on_an_online_auction/hgk84zn/
10/14/2021 1:57:33 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
q7ne98
t3_q7ne98
q7ne98
1
q7ne98
False
False
False
0
2
33
33
1
2.94117647058824
2
5.88235294117647
0
0
12
35.2941176470588
34
128, 128, 128
3
Solid
50
Yes
79
RepliedTo
10/14/2021 2:48:19 AM
What is an API?
hgkedrj
AskProgramming
theofficialjewses
t1_hgkedrj
https://www.reddit.com/r/AskProgramming/comments/q7ne98/how_can_i_auto_populate_notes_on_an_online_auction/hgkedrj/
10/14/2021 2:48:19 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hgk84zn
t1_hgk84zn
hgk84zn
1
q7ne98
True
False
False
1
1
33
33
0
0
0
0
0
0
1
25
4
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
78
RepliedTo
10/14/2021 3:00:36 AM
Application Programming Interface;
Examples:
https://www.reddit.com/dev/api/
https://developers.facebook.com/docs/graph-api/
https://webservices.amazon.com/paapi5/documentation/
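In practice, "using an API" mostly means sending an HTTP request and decoding structured (usually JSON) output instead of parsing HTML. A minimal sketch with Python's standard library; note the linked pages are documentation, not data endpoints, and most real APIs also require a key and a descriptive User-Agent:

```python
import json
import urllib.request

def fetch_json(url, user_agent="demo-script/0.1"):
    """GET a URL and decode its body as JSON. Many APIs reject
    requests that send no User-Agent header, so we always set one."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

# e.g. fetch_json("https://www.reddit.com/r/AskProgramming/about.json")
```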
hgkfv48
AskProgramming
KingofGamesYami
t1_hgkfv48
https://www.reddit.com/r/AskProgramming/comments/q7ne98/how_can_i_auto_populate_notes_on_an_online_auction/hgkfv48/
10/14/2021 3:00:36 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hgkedrj
t1_hgkedrj
hgkedrj
0
q7ne98
False
False
False
2
2
33
33
0
0
0
0
0
0
4
100
4
128, 128, 128
3
Solid
50
No
76
Commented
12/24/2022 1:19:33 AM
What do you want to be? Data analyst, data scientist or a data engineer?
j1fzij6
data
alfarabi-logic
t1_j1fzij6
https://www.reddit.com/r/data/comments/zqa8v2/what_would_you_do/j1fzij6/
12/24/2022 1:19:33 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
zqa8v2
t3_zqa8v2
zqa8v2
0
zqa8v2
False
False
False
0
1
36
36
0
0
0
0
0
0
6
42.8571428571429
14
128, 128, 128
3
Solid
50
No
75
Commented
12/20/2022 9:27:34 PM
Personally I learn better when I'm interested in something. And as someone who's been a hiring manager, I'm more interested in how you think about a problem, how you worked through it, how you decided what tools to use, what you learned, etc than "can you perform x technical task."
What's something you're interested in, and what's a question you can ask that can be answered with data? Google "open data portal \[your city\]" and see if you can find some public datasets. Write up your process of finding the data, understanding the structure, making some transformations before you even get to answering the question, etc.
To me the content and process stand out more than seeing another "imdb recommendation algorithm" or "spam detector" or other things that you see repeated in lots of portfolios.
j10x0e4
data
RenRidesCycles
t1_j10x0e4
https://www.reddit.com/r/data/comments/zqa8v2/what_would_you_do/j10x0e4/
12/20/2022 9:27:34 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
zqa8v2
t3_zqa8v2
zqa8v2
0
zqa8v2
False
False
False
0
1
36
36
3
2.22222222222222
1
0.740740740740741
0
0
61
45.1851851851852
135
128, 128, 128
3
Solid
50
No
77
Posted
12/20/2022 1:30:49 AM
Hello, I'm a psychology student who hates his degree. I'm studying data analytics, and I can't really find an internship. What projects would you do in my position to showcase what you can do? (I'm happy to figure out things I don't know yet)
1. Atm I know some basic R
2. SQL queries
3. Tableau Visualization
4. Excel
5. Data Scraping (using a service like Octoparse)
zqa8v2
data
Ethanwii
t3_zqa8v2
https://www.reddit.com/r/data/comments/zqa8v2/what_would_you_do/
12/20/2022 1:30:49 AM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What Would You Do?
False
0.67
zqa8v2
0
1
36
36
2
2.98507462686567
1
1.49253731343284
0
0
28
41.7910447761194
67
128, 128, 128
3
Solid
50
No
74
Commented
12/20/2022 1:50:07 PM
Would suggest learning Python (this one is debatable, but the reason is that it's much more versatile).
Look up [Kaggle](https://www.kaggle.com/); there are tons of datasets/competitions there that you could use/analyze.
j0z0i99
data
Grouchy_Document7786
t1_j0z0i99
https://www.reddit.com/r/data/comments/zqa8v2/what_would_you_do/j0z0i99/
12/20/2022 1:50:07 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
zqa8v2
t3_zqa8v2
zqa8v2
0
zqa8v2
False
False
False
0
1
36
36
1
2.85714285714286
1
2.85714285714286
0
0
15
42.8571428571429
35
128, 128, 128
3
Solid
50
No
72
Commented
10/4/2018 8:05:59 AM
Should have a `TRUE RESET` button
e753bp1
softwaregore
Kotauskas
t1_e753bp1
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/e753bp1/
10/4/2018 8:05:59 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
9l1x4s
t3_9l1x4s
9l1x4s
0
9l1x4s
False
False
False
0
1
7
7
0
0
0
0
0
0
3
50
6
128, 128, 128
3
Solid
50
No
71
Commented
10/4/2018 12:03:56 AM
Is this a continue? Screen
e74ft0p
softwaregore
MrPopzicle-Supercard
t1_e74ft0p
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/e74ft0p/
10/4/2018 12:03:56 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
9l1x4s
t3_9l1x4s
9l1x4s
0
9l1x4s
False
False
False
0
1
7
7
0
0
0
0
0
0
2
40
5
128, 128, 128
3
Solid
50
No
69
Commented
10/3/2018 9:31:20 PM
*no-ing intensifies*
e745vvd
softwaregore
techgineer13
t1_e745vvd
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/e745vvd/
10/3/2018 9:31:20 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
9l1x4s
t3_9l1x4s
9l1x4s
0
9l1x4s
False
False
False
0
1
7
7
0
0
0
0
0
0
2
66.6666666666667
3
128, 128, 128
3
Solid
50
No
68
Commented
10/3/2018 8:42:52 PM
Task [REDACTED]
Has scp 079 breached containment again?
e742e2v
softwaregore
missile500
t1_e742e2v
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/e742e2v/
10/3/2018 8:42:52 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
9l1x4s
t3_9l1x4s
9l1x4s
0
9l1x4s
False
False
False
0
1
7
7
0
0
0
0
0
0
7
87.5
8
128, 128, 128
3
Solid
50
No
67
Commented
10/3/2018 8:05:59 PM
I misread “exist” as exit and I spent way longer than I should’ve trying to figure out the problem
e73zle1
softwaregore
matrixyst
t1_e73zle1
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/e73zle1/
10/3/2018 8:05:59 PM
1/1/0001 12:00:00 AM
False
False
8
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
9l1x4s
t3_9l1x4s
9l1x4s
0
9l1x4s
False
False
False
0
1
7
7
0
0
2
10
0
0
8
40
20
128, 128, 128
3
Solid
50
No
66
Commented
10/3/2018 5:44:37 PM
"No"
e73ozql
softwaregore
dinojl
t1_e73ozql
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/e73ozql/
10/3/2018 5:44:37 PM
1/1/0001 12:00:00 AM
False
False
29
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
9l1x4s
t3_9l1x4s
9l1x4s
1
9l1x4s
False
False
False
0
1
7
7
0
0
0
0
0
0
0
0
1
128, 128, 128
3
Solid
50
No
65
RepliedTo
10/3/2018 6:23:07 PM
r/suicidebywords
e73rsab
softwaregore
NoNameRequiredxD
t1_e73rsab
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/e73rsab/
10/3/2018 6:23:07 PM
1/1/0001 12:00:00 AM
False
False
14
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
e73ozql
t1_e73ozql
e73ozql
1
9l1x4s
False
False
False
1
1
7
7
0
0
0
0
0
0
1
50
2
128, 128, 128
3
Solid
50
No
64
RepliedTo
10/3/2018 10:38:28 PM
Literally
e74afgo
softwaregore
fdf2002
t1_e74afgo
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/e74afgo/
10/3/2018 10:38:28 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
e73rsab
t1_e73rsab
e73rsab
1
9l1x4s
False
False
False
2
1
7
7
0
0
0
0
0
0
1
100
1
128, 128, 128
3
Solid
50
No
70
Commented
10/3/2018 10:39:11 PM
r/engrish?
e74ah9g
softwaregore
fdf2002
t1_e74ah9g
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/e74ah9g/
10/3/2018 10:39:11 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
9l1x4s
t3_9l1x4s
9l1x4s
0
9l1x4s
False
False
False
0
1
7
7
0
0
0
0
0
0
1
50
2
128, 128, 128
3
Solid
50
No
63
RepliedTo
10/3/2018 11:16:27 PM
r/technicallythetruth
e74curn
softwaregore
hughjanus0
t1_e74curn
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/e74curn/
10/3/2018 11:16:27 PM
1/1/0001 12:00:00 AM
False
False
4
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
e74afgo
t1_e74afgo
e74afgo
0
9l1x4s
False
False
False
3
1
7
7
0
0
0
0
0
0
1
50
2
128, 128, 128
3
Solid
50
No
62
Commented
10/3/2018 5:12:09 PM
Seems like a surreal meme
e73mltb
softwaregore
SSUPII
t1_e73mltb
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/e73mltb/
10/3/2018 5:12:09 PM
1/1/0001 12:00:00 AM
False
False
11
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
9l1x4s
t3_9l1x4s
9l1x4s
0
9l1x4s
False
False
False
0
1
7
7
1
20
0
0
0
0
2
40
5
128, 128, 128
3
Solid
50
No
73
Posted
10/3/2018 2:59:18 PM
https://i.redd.it/ykj2301xizp11.png
9l1x4s
softwaregore
Twiggy3
t3_9l1x4s
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/
10/3/2018 2:59:18 PM
1/1/0001 12:00:00 AM
False
False
181
4
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse gives me an interesting option.
False
0.98
9l1x4s
0
1
7
7
128, 128, 128
3
Solid
50
No
61
Commented
10/3/2018 3:05:49 PM
Having some existential crisis going on I see
e73djtc
softwaregore
MrPenguin9
t1_e73djtc
https://www.reddit.com/r/softwaregore/comments/9l1x4s/octoparse_gives_me_an_interesting_option/e73djtc/
10/3/2018 3:05:49 PM
1/1/0001 12:00:00 AM
False
False
19
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
9l1x4s
t3_9l1x4s
9l1x4s
0
9l1x4s
False
False
False
0
1
7
7
0
0
1
12.5
0
0
4
50
8
128, 128, 128
3
Solid
50
Yes
59
Commented
7/18/2021 11:49:02 PM
Like one of these [alternatives](https://alternativeto.net/software/octoparse/)?
h5oy4cy
datamining
SurlyNacho
t1_h5oy4cy
https://www.reddit.com/r/datamining/comments/om0p0o/does_anyone_know_of_a_visual_scraping_software/h5oy4cy/
7/18/2021 11:49:02 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
om0p0o
t3_om0p0o
om0p0o
1
om0p0o
False
False
False
0
1
47
47
0
0
0
0
0
0
6
60
10
128, 128, 128
3
Solid
50
Yes
58
RepliedTo
7/19/2021 4:41:33 AM
I might as well check out Scrapy; I have been using Selenium but struggling to create working code. Ideally I am looking for software that has a simple point-and-click front end while creating the actual code on the back end based on actions taken on the front, and then finally printing the script for you to learn from and adjust as you see fit. Thanks for this list though!
h5puy84
datamining
WeederGate
t1_h5puy84
https://www.reddit.com/r/datamining/comments/om0p0o/does_anyone_know_of_a_visual_scraping_software/h5puy84/
7/19/2021 4:41:33 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h5oy4cy
t1_h5oy4cy
h5oy4cy
0
om0p0o
True
False
False
1
1
47
47
2
2.77777777777778
1
1.38888888888889
0
0
33
45.8333333333333
72
128, 128, 128
3
Solid
50
Yes
57
Commented
7/18/2021 4:42:09 AM
Kimono was a great tool before it was acquired by Palantir.
Basically helped you select elements on a page and then helped you convert that to a scheduled scrape.
h5lgq23
datamining
Simius
t1_h5lgq23
https://www.reddit.com/r/datamining/comments/om0p0o/does_anyone_know_of_a_visual_scraping_software/h5lgq23/
7/18/2021 4:42:09 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
om0p0o
t3_om0p0o
om0p0o
1
om0p0o
False
False
False
0
1
47
47
3
10.7142857142857
0
0
0
0
11
39.2857142857143
28
128, 128, 128
3
Solid
50
Yes
56
RepliedTo
7/18/2021 10:57:40 AM
Hmmm, ok cool, thank you. Do you mind expanding on what has gone wrong with Kimono since being acquired by Palantir? That sounds like a great start. I'm surprised there aren't more programs like this, it is really useful for practical learners.
h5m9ag3
datamining
WeederGate
t1_h5m9ag3
https://www.reddit.com/r/datamining/comments/om0p0o/does_anyone_know_of_a_visual_scraping_software/h5m9ag3/
7/18/2021 10:57:40 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h5lgq23
t1_h5lgq23
h5lgq23
0
om0p0o
True
False
False
1
1
47
47
4
9.52380952380952
1
2.38095238095238
0
0
17
40.4761904761905
42
128, 128, 128
3
Solid
50
No
60
Posted
7/17/2021 8:55:59 AM
Hello everyone,
I'm new to scraping and coding related activity so hopefully the question is clear. I am looking for a visual scraping software similar to Octoparse, but it could also be a browser extension, that writes the script as I click on the front-end. Appreciate any insight you can give on this.
om0p0o
datamining
WeederGate
t3_om0p0o
https://www.reddit.com/r/datamining/comments/om0p0o/does_anyone_know_of_a_visual_scraping_software/
7/17/2021 8:55:59 AM
1/1/0001 12:00:00 AM
False
False
5
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Does anyone know of a visual scraping software that can also create script for you to use?
False
0.86
om0p0o
0
1
47
47
2
3.7037037037037
0
0
0
0
23
42.5925925925926
54
128, 128, 128
3
Solid
50
Yes
54
Commented
3/9/2020 12:54:34 PM
Hey, is Octoparse a downloadable app? Because I am using [https://e-scraper.com/shopify/](https://e-scraper.com/shopify/) (SaaS) for my eCommerce projects...
fk0j5lv
SuggestALaptop
alex2440933
t1_fk0j5lv
https://www.reddit.com/r/SuggestALaptop/comments/f60jmn/sos_i_need_a_new_laptop_yesterday_i_have_no_idea/fk0j5lv/
3/9/2020 12:54:34 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
f60jmn
t3_f60jmn
f60jmn
1
f60jmn
False
False
False
0
1
24
24
0
0
0
0
0
0
12
46.1538461538462
26
128, 128, 128
3
Solid
50
Yes
53
RepliedTo
3/10/2020 3:43:36 AM
Yes it is downloadable
fk2xthp
SuggestALaptop
Sanbenito444
t1_fk2xthp
https://www.reddit.com/r/SuggestALaptop/comments/f60jmn/sos_i_need_a_new_laptop_yesterday_i_have_no_idea/fk2xthp/
3/10/2020 3:43:36 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fk0j5lv
t1_fk0j5lv
fk0j5lv
0
f60jmn
True
False
False
1
1
24
24
0
0
0
0
0
0
2
50
4
128, 128, 128
3
Solid
50
No
52
Commented
2/19/2020 4:11:50 PM
I would recommend this [Acer Aspire 5](https://www.amazon.com/dp/B07RF2123Z/?tag=bkadamos_alltest-20) because of the following:
* It offers great value for money since it comes with 8 GB RAM and a 256 GB SSD, and both are pretty much standard in the $600 range, so finding them in the ~$500 range is something of a catch.
* Slim, sleek high-quality case and the aluminum lid gives the Aspire a premium look.
* You can actually rotate the lid back a little beyond 180 degrees, meaning you can lay the laptop completely flat with the lid open.
* It is lightweight for its size at 3.97 lbs
* Comes with backlit keyboard, you also get a dedicated numeric keypad with a somewhat narrow design.
* Advertised battery life of up to 7.5 hours.
* Comes with a range of ports, including HDMI, USB 3.1 & USB 2.0.
* You can upgrade SSD and RAM up to 20GB according to this
https://www.crucial.com/usa/en/compatible-upgrade-for/Acer/aspire-a515-54-51dj
[Here is a more detailed review of the Acer Aspire 5](http://laptoplegend.com/2019/best-budget-laptop-under-500-in-2019/)
Make sure to check out [This Laptop Buying Guide post for best laptops](https://old.reddit.com/r/SuggestALaptop/comments/ebtg1r/gift_buying_guide_best_laptops_to_buy_for/)
fi3vt6f
SuggestALaptop
LonerIM2
t1_fi3vt6f
https://www.reddit.com/r/SuggestALaptop/comments/f60jmn/sos_i_need_a_new_laptop_yesterday_i_have_no_idea/fi3vt6f/
2/19/2020 4:11:50 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
f60jmn
t3_f60jmn
f60jmn
0
f60jmn
False
False
False
0
1
24
24
9
4.52261306532663
0
0
0
0
104
52.2613065326633
199
128, 128, 128
3
Solid
50
No
51
Commented
2/18/2020 11:17:43 PM
Check out the[ **ASUS VivoBook 15**](https://www.amazon.com/ASUS-VivoBook-Backlit-Keyboard-Bluetooth/dp/B07Y5PK8SC/ref=as_li_ss_tl?crid=20CLGW3HQ2ITJ&keywords=asus+vivobook+15&qid=1579741675&sprefix=asus+vivobook,aps,265&sr=8-4&linkCode=ll1&tag=laptop04f1-20&linkId=35ccbf581cd9c51f7d76ee666c7656e1&language=en_US) it is a great laptop for under $500. It is lightweight, reliable for the price and has decent specs
* 15.6 Inch FHD 1080P Laptop
* AMD Ryzen 3 3200U up to 3.5GHz,
* 16GB DDR4 RAM,
* 256GB SSD,
16 GB of RAM will be more than enough for all programs you plan on running so no need to worry about upgrading it or the SSD anytime soon
fi1z5r5
SuggestALaptop
elvinelmo
t1_fi1z5r5
https://www.reddit.com/r/SuggestALaptop/comments/f60jmn/sos_i_need_a_new_laptop_yesterday_i_have_no_idea/fi1z5r5/
2/18/2020 11:17:43 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
f60jmn
t3_f60jmn
f60jmn
0
f60jmn
False
False
False
0
1
24
24
4
3.6036036036036
1
0.900900900900901
0
0
69
62.1621621621622
111
128, 128, 128
3
Solid
50
Yes
50
Commented
2/18/2020 11:00:46 PM
Hi, I think that the [**Lenovo IdeaPad 330S**](https://goto.walmart.com/c/1883484/565706/9383?veh=aff&sourceid=imp_000011112222333344&u=https%3A%2F%2Fwww.walmart.com%2Fip%2FLenovo-Ideapad-330s-15-6-Laptop-Windows-10-AMD-Ryzen-5-2500U-Quad-Core-Processor-8GB-Memory-256GB-Storage-Platinum-Grey-81FB00HKUS%2F273186587) is the best budget device if you want to be able to save some of that money and contribute it to a more long term device in the future. For $359, it has all of the basic specifications needed: 4-core Ryzen 5 processor, 8 GB RAM, a 256 GB SSD, and a 1080p 15.6" display. This combination should handle the uses you listed just fine, they don't seem to be very demanding.
fi1xjk9
SuggestALaptop
legos45
t1_fi1xjk9
https://www.reddit.com/r/SuggestALaptop/comments/f60jmn/sos_i_need_a_new_laptop_yesterday_i_have_no_idea/fi1xjk9/
2/18/2020 11:00:46 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
f60jmn
t3_f60jmn
f60jmn
1
f60jmn
False
False
False
0
1
24
24
2
1.6
0
0
0
0
72
57.6
125
128, 128, 128
3
Solid
50
Yes
49
RepliedTo
2/18/2020 11:16:32 PM
I actually think this is a great option and I found one 5 blocks away for $299! Thank you so much!
fi1z1th
SuggestALaptop
Sanbenito444
t1_fi1z1th
https://www.reddit.com/r/SuggestALaptop/comments/f60jmn/sos_i_need_a_new_laptop_yesterday_i_have_no_idea/fi1z1th/
2/18/2020 11:16:32 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
fi1xjk9
t1_fi1xjk9
fi1xjk9
0
f60jmn
True
False
False
1
1
24
24
2
9.52380952380952
0
0
0
0
9
42.8571428571429
21
128, 128, 128
3
Solid
50
No
55
Posted
2/18/2020 10:56:12 PM
* **Total budget and country of purchase:**
United States: Texas. Under $500
* **Do you prefer a 2 in 1 form factor, good battery life or best specifications for the money? Pick or include any that apply.**
speed and efficiency is most important followed by price.
* **How important is weight and thinness to you?**
This is medium important. I am a tiny adult human so ideally the thinner and lighter the better. That said, I definitely want a medium-large screen (15 inches?) and since I need to buy a pretty cheap computer so as long as it's relatively easy to lug around I won't be too picky.
* **Which OS do you require? Windows, Mac, Chrome OS, Linux.**
Windows preferably, but am not opposed to just Chrome OS either. I don't know anything about Linux and I don't want a mac.
* **Do you have a preferred screen size? If indifferent, put N/A.**
bigger than 12" which is what I have now. I am thinking 15-ish.
* **Are you doing any CAD/video editing/photo editing/gaming? List which programs/games you desire to run.**
I will use it to surf the net, sourcing, research etc. I want to run a couple of data mining and web scraping tools and do some light website and digital marketing design, most of these are online tools. The only downloaded programs I use are Octoparse and Office 365. I will not be using this computer for gaming, downloading music/videos, streaming a ton of media, and I don't care for fills and extra things I won't be using.
* **If you're gaming, do you have certain games you want to play? At what settings and FPS do you want?**
No gaming.
* **Any specific requirements such as good keyboard, reliable build quality, touch-screen, finger-print reader, optical drive or good input devices (keyboard/touchpad)?**
I very much need to be able to use a mouse with the computer. Not a touchpad. I need a minimum of 6G RAM.
* **Leave any finishing thoughts here that you may feel are necessary and beneficial to the discussion.**
I would prefer not to have useless bells and whistles. I just need to get on the internet, have a ton of tabs open, use chrome, and do a bit of design and have the computer work properly and move swiftly between tasks/tabs/windows/browsers/functions/programs. I need something easy to use that's super fast. I don't care if I have to buy a new one in 2 years. This is for the present and immediate future as I know that my price point isn't going to buy something truly fantastic or even very reliable. Is what it is.
f60jmn
SuggestALaptop
Sanbenito444
t3_f60jmn
https://www.reddit.com/r/SuggestALaptop/comments/f60jmn/sos_i_need_a_new_laptop_yesterday_i_have_no_idea/
2/18/2020 10:56:12 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
SOS- I Need A New Laptop Yesterday! I have no idea, so make me some recommendations so I can purchase ASAP:
False
1
f60jmn
0
1
24
24
25
5.56792873051225
4
0.89086859688196
0
0
197
43.8752783964365
449
128, 128, 128
3
Solid
50
No
48
Posted
8/23/2017 3:11:25 PM
[removed]
6vjlb8
a:t5_2zg0i
perpjattithero
t3_6vjlb8
https://www.reddit.com/r/a:t5_2zg0i/comments/6vjlb8/octoparse_642/
8/23/2017 3:11:25 PM
11/2/2017 7:33:22 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse 6.4.2
False
1
6vjlb8
0
1
1
1
0
0
0
0
0
0
1
100
1
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
No
540
Posted
8/27/2019 9:22:15 AM
Hello folks, I think you will all agree with me on how powerful web scraping can be: it extracts data online and saves it in a structured format for analysis. Inspired by the idea of data extraction, I think it is a good idea to start content curation with web scraping. Content curation is a very popular business model on the internet, and it is possible to make money via affiliate marketing, product promotion, and advertising. This is a step-by-step tutorial about [how to scrape news articles from news media](https://www.youtube.com/watch?v=VOoPev_GzUM&t=153s). We can start from there and extend to scraping other social media platforms to collect niche subjects.
I also wrote an article about [content curation](https://www.octoparse.com/blog/how-web-scraping-for-content-curation-works). Thanks to the web scraping tool, which automates the extraction without requiring tech skills. Please leave comments; I am inspired to share more information.
cw1xhg
datagangsta
Octoparse
t3_cw1xhg
https://www.reddit.com/r/datagangsta/comments/cw1xhg/data_scraping_101_with_web_scraping_tool_without/
8/27/2019 9:22:15 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Data Scraping 101 with Web Scraping Tool without coding
False
0.5
cw1xhg
0
9
12
12
4
2.51572327044025
0
0
0
0
86
54.0880503144654
159
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
No
539
Posted
5/20/2019 8:28:39 AM
**The goal of this workshop is to use a website scraper to read and pull tweets about Donald Trump. Then we will use a combination of text mining and visualization techniques to analyze the public voice about Donald Trump.** There is nothing fancy; it's just practice using Python. It is difficult at the beginning, but once you practice more, you pick up tricks. Everything becomes easy.
For a detailed explanation of the process, you can visit [here.](https://www.octoparse.com/blog/text-mining-scraping-and-sentiment-analysis-using-python) For a complete version of the code, you can download it here ([https://gist.github.com/octoparse/fd9e0006794754edfbdaea86de5b1a51](https://gist.github.com/octoparse/fd9e0006794754edfbdaea86de5b1a51)).
Step 1: I scraped 10K tweets using [**Octoparse**](https://www.octoparse.com/) since it's a fully free web scraping tool, and exported the data into txt format.
Step 2: load the opinion word lists using Notepad++, and preprocess the extracted tweets by taking out punctuation, signs, and numbers.
Step 3: **take each opinion word from the lists, return to the tweets, and count the frequency of each opinion word in the tweets. As a result, we collect the corresponding opinion words in the tweets and their counts.**
Step 4: export the results into Excel/CSV.
Step 5: load the result using Tableau Public and choose the graph template you like to visualize the data.
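Steps 2 and 3 together boil down to a tokenize-and-tally pass. A minimal sketch of that pass (my own illustration, not the code in the linked gist):

```python
import re
from collections import Counter

def opinion_word_counts(tweets, opinion_words):
    """Strip punctuation, lowercase, then tally only the words
    that appear in the opinion-word list (Steps 2-3 above)."""
    words = re.findall(r"[a-z']+", " ".join(tweets).lower())
    return Counter(w for w in words if w in opinion_words)

tweets = ["Great speech, great crowd!", "What a terrible idea."]
print(opinion_word_counts(tweets, {"great", "terrible"}))
# Counter({'great': 2, 'terrible': 1})
```

The resulting word/count pairs are exactly what Step 4 exports to CSV for Tableau.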
[Scraping tweets using Octoparse](https://preview.redd.it/evklepiktbz21.png?width=1200&format=png&auto=webp&v=enabled&s=90123aca625f9fd30f0566e145a6d5379c7b00fd)
[Word used in Twitter](https://preview.redd.it/g6qovulntbz21.png?width=1155&format=png&auto=webp&v=enabled&s=9dfe7351052565c8959bdfa499db42209e9717e9)
[Positive Words and its frequency](https://preview.redd.it/p1qwalhytbz21.png?width=932&format=png&auto=webp&v=enabled&s=56bf6ee2654b729f0a70a6c2ffd51236eddb94e0)
[Positive Words and its frequency](https://preview.redd.it/yxknwxqptbz21.png?width=921&format=png&auto=webp&v=enabled&s=3175263394e3c2e23291f3ee2cd6b5b03108655c)
bqsvjl
Python
Octoparse
t3_bqsvjl
https://www.reddit.com/r/Python/comments/bqsvjl/5_steps_text_mining_and_sentiment_analysis_using/
5/20/2019 8:28:39 AM
1/1/0001 12:00:00 AM
False
False
3
2
Silver:0 Gold:0 Platinum:0 Count:0
False
False
5 steps text mining and Sentiment Analysis using Python
False
0.64
bqsvjl
0
9
12
12
6
1.81818181818182
1
0.303030303030303
0
0
186
56.3636363636364
330
128, 128, 128
3.00756859035005
Dash Dot Dot
49.9675631842141
No
538
Posted
8/26/2019 10:00:44 AM
[removed]
cvlysq
Python
Octoparse
t3_cvlysq
https://www.reddit.com/r/Python/comments/cvlysq/is_web_scraping_legal/
8/26/2019 10:00:44 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Is Web Scraping Legal?
False
0.18
cvlysq
0
9
12
12
0
0
0
0
0
0
1
100
1
128, 128, 128
3
Solid
50
No
47
Commented
8/26/2019 12:01:16 PM
Hello! I'm a bot!
It looks to me like your post might be better suited for r/learnpython,
a sub geared towards questions and learning more about python.
That said, I am a bot and it is hard to tell.
Please follow the subs rules and guidelines when you do post there, it'll help you get better answers faster.
Show /r/learnpython the code you have tried and describe where you are stuck.
**[Be sure to format your code for reddit](https://www.reddit.com/r/learnpython/wiki/faq#wiki_how_do_i_format_code.3F)**
and include which version of python and what OS you are using.
You can also ask this question in the [Python discord](https://discord.gg/3Abzge7),
a large, friendly community focused around the Python programming language, open to those who wish to learn the language
or improve their skills, as well as those looking to help others.
***
[^(README)](https://github.com/CrakeNotSnowman/redditPythonHelper)
^(|)
[^(FAQ)](https://github.com/CrakeNotSnowman/redditPythonHelper/blob/master/FAQ.md)
^(|)
^(this bot is written and managed by /u/IAmKindOfCreative)
^(This bot is currently under development and experiencing changes to improve its usefulness)
ey53v1z
Python
pythonHelperBot
t1_ey53v1z
https://www.reddit.com/r/Python/comments/cvlysq/is_web_scraping_legal/ey53v1z/
8/26/2019 12:01:16 PM
1/1/0001 12:00:00 AM
False
False
-2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
cvlysq
t3_cvlysq
cvlysq
0
cvlysq
False
False
False
0
1
12
12
8
4.3010752688172
4
2.1505376344086
0
0
86
46.2365591397849
186
128, 128, 128
3
Solid
50
No
46
Posted
10/20/2022 12:58:48 AM
Hello friends,
Like many others, I am quite inexperienced with webscraping as well as Python. In the past, I've used Octoparse to get data for projects, but now that I'm a big boy and am learning Python I figure it's best to explore other options.
I am currently trying to get Yelp data for an academic project, but am hesitant to dive right into scraping, as it seems Yelp is particularly hostile to scrapers. The data I'm interested in isn't part of the public dataset, and while they invite those with academic interests to contact them, I'm not very optimistic about getting the data from them any time soon.
If anyone has any suggestions for methods of obtaining data from Yelp, please let me know--there seem to be quite a few third party tutorials and APIs, but, as a student, I'm hesitant to fork out money for something that may or may not work.
Thanks!
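For what it's worth, Yelp also offers an official Fusion API (free tier with an API key from its developer portal), which for an academic project is usually safer than scraping. A sketch of a search call, assuming the documented v3 endpoint and Bearer-token auth; you'd supply your own key:

```python
import json
import urllib.parse
import urllib.request

SEARCH_ENDPOINT = "https://api.yelp.com/v3/businesses/search"

def build_search_url(term, location):
    """Compose the businesses/search URL with query parameters."""
    return SEARCH_ENDPOINT + "?" + urllib.parse.urlencode(
        {"term": term, "location": location})

def yelp_search(term, location, api_key):
    """Run one search against the Fusion API; needs a real API key."""
    req = urllib.request.Request(
        build_search_url(term, location),
        headers={"Authorization": "Bearer " + api_key})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```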
y8jsms
webscraping
National_Tart6235
t3_y8jsms
https://www.reddit.com/r/webscraping/comments/y8jsms/options_for_yelp_data/
10/20/2022 12:58:48 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Options for yelp data?
False
1
y8jsms
0
1
72
72
6
3.84615384615385
4
2.56410256410256
0
0
65
41.6666666666667
156
128, 128, 128
3
Solid
50
No
45
Commented
3/22/2023 5:53:05 AM
If you’re still looking : https://www.yelp.com/dataset
jd6ovug
webscraping
Altruistic_Olives
t1_jd6ovug
https://www.reddit.com/r/webscraping/comments/y8jsms/options_for_yelp_data/jd6ovug/
3/22/2023 5:53:05 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
y8jsms
t3_y8jsms
y8jsms
0
y8jsms
False
False
False
0
1
72
72
0
0
0
0
0
0
3
60
5
128, 128, 128
3
Solid
50
No
44
Posted
1/5/2022 4:10:38 AM
I have a stream and I wanted a live-ish counter for my kills/stats. It only updates after each match so this thing only needs to check every 10-ish minutes.
Problem is, I have no idea how to extract the info from the page on a regular basis. I tried Octoparse, but I don't know how to get that set up to repeat. I feel like I'm in over my head.
rwdqb2
webscraping
Fix_Riven
t3_rwdqb2
https://www.reddit.com/r/webscraping/comments/rwdqb2/i_need_to_regularly_update_two_stats_from_one/
1/5/2022 4:10:38 AM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
I need to regularly update two stats from one site for my stream, but I have no idea what I'm doing
False
1
rwdqb2
0
1
10
10
0
0
2
2.8169014084507
0
0
28
39.4366197183099
71
128, 128, 128
3
Solid
50
No
43
Commented
1/5/2022 9:18:01 AM
Assume from your name you're scraping op.gg or something similar? Easiest is to have a script that runs every few minutes via Windows Task Scheduler. I'll help you with the script if you tell me the site you're scraping; then you can run it on your end during your stream.
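A minimal sketch of that approach, assuming a hypothetical stats page whose markup exposes the numbers in data attributes (the real selectors depend on whichever site is being scraped):

```python
import re

def extract_stats(html: str) -> dict:
    """Pull kills/wins out of the stats page HTML; the attributes are hypothetical."""
    kills = int(re.search(r'data-kills="(\d+)"', html).group(1))
    wins = int(re.search(r'data-wins="(\d+)"', html).group(1))
    return {"kills": kills, "wins": wins}

def write_overlay(stats: dict, path: str = "stats.txt") -> None:
    """OBS and similar streaming tools can display a text file as a live overlay."""
    with open(path, "w") as f:
        f.write(f"Kills: {stats['kills']}  Wins: {stats['wins']}")

# In a real run, fetch the page and repeat every ~10 minutes, either by having
# Windows Task Scheduler launch the script, or with a simple loop:
#   while True:
#       write_overlay(extract_stats(requests.get(STATS_URL).text))
#       time.sleep(600)
```

Task Scheduler is the lighter option here, since the script then only runs when it's actually needed.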
hrc0szv
webscraping
bushcat69
t1_hrc0szv
https://www.reddit.com/r/webscraping/comments/rwdqb2/i_need_to_regularly_update_two_stats_from_one/hrc0szv/
1/5/2022 9:18:01 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
rwdqb2
t3_rwdqb2
rwdqb2
0
rwdqb2
False
False
False
0
1
10
10
1
1.85185185185185
0
0
0
0
23
42.5925925925926
54
128, 128, 128
3
Solid
50
No
42
Posted
2/24/2022 10:06:42 PM
[removed]
t0mt1o
u_medkhalilbenahmed
medkhalilbenahmed
t3_t0mt1o
https://www.reddit.com/r/u_medkhalilbenahmed/comments/t0mt1o/is_there_any_way_to_build_a_generic_scraper_in_a/
2/24/2022 10:06:42 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
is there any way to build a generic scraper in a web app like portia / octoparse
False
1
t0mt1o
0
1
1
1
0
0
0
0
0
0
1
100
1
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
40
Commented
3/8/2022 2:14:31 PM
You can contact their support to get help
hzu7fyj
webscraping
awebscrapingguy
t1_hzu7fyj
https://www.reddit.com/r/webscraping/comments/t9docb/nooboctoparse_stuck_at_10_results_while_checking/hzu7fyj/
3/8/2022 2:14:31 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
t9docb
t3_t9docb
t9docb
1
t9docb
False
False
False
0
2
6
6
1
12.5
0
0
0
0
2
25
8
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
39
RepliedTo
3/8/2022 2:21:48 PM
Even if I am not a paid user?
hzu8ei0
webscraping
ibroughtashrubbery
t1_hzu8ei0
https://www.reddit.com/r/webscraping/comments/t9docb/nooboctoparse_stuck_at_10_results_while_checking/hzu8ei0/
3/8/2022 2:21:48 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hzu7fyj
t1_hzu7fyj
hzu7fyj
1
t9docb
True
False
False
1
4
6
6
0
0
0
0
0
0
2
25
8
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
38
RepliedTo
3/8/2022 2:39:18 PM
Yes, you could even try StackOverflow, which is probably more appropriate for getting help. To be transparent with you: they have massively spammed here, generated fake activity across many accounts with fake questions and answers, and they still bother us with recurring spam. Weirdly, when people come to ask questions or for help, they disappear.
Secondly, your post only shows the resulting error but gives zero info or context to explain it. I don't know what you expect from such a post, but nobody will be able to provide relevant pointers for your issue.
In my opinion (and I'm probably biased due to moderation) and in my experience, their product is probably one of the worst on the market, and I don't know of anyone or any company running on this service.
hzuarni
webscraping
awebscrapingguy
t1_hzuarni
https://www.reddit.com/r/webscraping/comments/t9docb/nooboctoparse_stuck_at_10_results_while_checking/hzuarni/
3/8/2022 2:39:18 PM
3/8/2022 3:00:43 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hzu8ei0
t1_hzu8ei0
hzu8ei0
1
t9docb
False
False
False
2
2
6
6
2
1.5748031496063
8
6.2992125984252
0
0
50
39.3700787401575
127
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
37
RepliedTo
3/8/2022 3:10:06 PM
Thanks...I'd be happy to try any other scraping tool as long as it allows a free run from my own computer. I don't need/want paid services and the query ain't that big. I tried Octoparse because of this, but I'd be happy to consider any other one if they offer any free limited plan.
Sorry I wasn't clear on the error. As a noob I'm not sure what information might be relevant. In my thinking, I assumed any Octoparse user could easily replicate the issue at hand by just loading any search from that site onto the scraper.
hzuf4w1
webscraping
ibroughtashrubbery
t1_hzuf4w1
https://www.reddit.com/r/webscraping/comments/t9docb/nooboctoparse_stuck_at_10_results_while_checking/hzuf4w1/
3/8/2022 3:10:06 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hzuarni
t1_hzuarni
hzuarni
0
t9docb
True
False
False
3
4
6
6
5
5
4
4
0
0
35
35
100
128, 128, 128
3
Solid
50
No
41
Posted
3/8/2022 10:11:09 AM
[removed]
t9docb
webscraping
ibroughtashrubbery
t3_t9docb
https://www.reddit.com/r/webscraping/comments/t9docb/nooboctoparse_stuck_at_10_results_while_checking/
3/8/2022 10:11:09 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
[noob]Octoparse stuck at 10 results while checking yp.com.hk...Why?
False
1
t9docb
0
1
6
6
0
0
0
0
0
0
1
100
1
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
36
Commented
3/8/2022 2:03:59 PM
octoparse is buggy as hell
hzu63kx
webscraping
Mr_Nice_
t1_hzu63kx
https://www.reddit.com/r/webscraping/comments/t9docb/nooboctoparse_stuck_at_10_results_while_checking/hzu63kx/
3/8/2022 2:03:59 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
t9docb
t3_t9docb
t9docb
1
t9docb
False
False
False
0
5
6
6
0
0
2
40
0
0
1
20
5
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
35
RepliedTo
3/8/2022 2:05:43 PM
Any alternative that allows a free scrape to be run locally?
hzu6bc5
webscraping
ibroughtashrubbery
t1_hzu6bc5
https://www.reddit.com/r/webscraping/comments/t9docb/nooboctoparse_stuck_at_10_results_while_checking/hzu6bc5/
3/8/2022 2:05:43 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hzu63kx
t1_hzu63kx
hzu63kx
1
t9docb
True
False
False
1
4
6
6
1
9.09090909090909
1
9.09090909090909
0
0
4
36.3636363636364
11
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
34
RepliedTo
3/8/2022 4:53:11 PM
I use selenium but you will have to write a script to tell it what to do.
hzuutb2
webscraping
Mr_Nice_
t1_hzuutb2
https://www.reddit.com/r/webscraping/comments/t9docb/nooboctoparse_stuck_at_10_results_while_checking/hzuutb2/
3/8/2022 4:53:11 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hzu6bc5
t1_hzu6bc5
hzu6bc5
1
t9docb
False
False
False
2
5
6
6
0
0
0
0
0
0
5
29.4117647058824
17
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
33
RepliedTo
3/9/2022 3:48:26 AM
Thanks... I guess I'll need to check some examples. Nothing useful in the point-and-click area, right?
hzxhqpx
webscraping
ibroughtashrubbery
t1_hzxhqpx
https://www.reddit.com/r/webscraping/comments/t9docb/nooboctoparse_stuck_at_10_results_while_checking/hzxhqpx/
3/9/2022 3:48:26 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hzuutb2
t1_hzuutb2
hzuutb2
1
t9docb
True
False
False
3
4
6
6
2
11.1111111111111
0
0
0
0
9
50
18
128, 128, 128
3.00378429517502
Solid
49.983781592107
Yes
32
RepliedTo
3/9/2022 2:50:18 PM
It depends on what you are scraping. There are several free Chrome plugins that might work well for you. I tried several paid services and found none of them could handle the bot detection I'm up against, so I had to code a custom solution.
hzz754q
webscraping
Mr_Nice_
t1_hzz754q
https://www.reddit.com/r/webscraping/comments/t9docb/nooboctoparse_stuck_at_10_results_while_checking/hzz754q/
3/9/2022 2:50:18 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hzxhqpx
t1_hzxhqpx
hzxhqpx
0
t9docb
False
False
False
4
5
6
6
4
8.69565217391304
0
0
0
0
18
39.1304347826087
46
128, 128, 128
3
Solid
50
No
31
Commented
11/16/2022 6:58:51 AM
This is a good free tool for getting data on your 1st-degree connections, including emails: https://sites.google.com/view/linkedin-email-finder/home
iwkbl8l
Octoparse_ideas
Old-Medicine2826
t1_iwkbl8l
https://www.reddit.com/r/Octoparse_ideas/comments/wx8ujr/best_linkedin_job_scraper_to_extract_job_postings/iwkbl8l/
11/16/2022 6:58:51 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
wx8ujr
t3_wx8ujr
wx8ujr
0
wx8ujr
False
False
False
0
1
3
3
2
13.3333333333333
0
0
0
0
8
53.3333333333333
15
128, 128, 128
3
Solid
50
Yes
29
Commented
1/16/2022 9:24:37 PM
I know you've solved the problem already but if you want a programmatic way this code gets you the raw data behind the table:
import requests
import json

s = requests.Session()
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-ZA,en;q=0.9',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive',
    'Host': 'airtable.com',
    'Pragma': 'no-cache',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36'
}
url = 'https://airtable.com/shrtlnin6RFvKPU9A/tbl2MH2ztc72AKFFL'
step = s.get(url, headers=headers)

# Get the data-table URL embedded in the page HTML
start = 'urlWithParams: '
end = 'earlyPrefetchSpan:'
x = step.text
new_url = 'https://airtable.com' + x[x.find(start) + len(start):x.rfind(end)].strip().replace('u002F', '').replace('"', '').replace('\\', '/')[:-1]  # pull the URL out of the HTML

# Get the Airtable auth headers embedded in the same HTML
start = 'var headers = '
end = "headers['x-time-zone'] "
dirty_auth_json = x[x.find(start) + len(start):x.rfind(end)].strip()[:-1]  # pull the auth tokens out of the HTML
auth_json = json.loads(dirty_auth_json)

new_headers = {
    'Accept': '*/*',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-ZA,en;q=0.9',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive',
    'Host': 'airtable.com',
    'ot-tracer-sampled': 'true',
    'ot-tracer-spanid': auth_json['ot-tracer-spanid'],
    'ot-tracer-traceid': auth_json['ot-tracer-traceid'],
    'Pragma': 'no-cache',
    'sec-ch-ua': '" Not;A Brand";v="99", "Google Chrome";v="97", "Chromium";v="97"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'Sec-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'same-origin',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36',
    'x-airtable-application-id': auth_json['x-airtable-application-id'],
    'x-airtable-inter-service-client': 'webClient',
    'x-airtable-page-load-id': auth_json['x-airtable-page-load-id'],
    'x-early-prefetch': 'true',
    'X-Requested-With': 'XMLHttpRequest',
    'x-time-zone': 'Europe/London',
    'x-user-locale': 'en-ZA'
}

# Fetch the raw table data using the extracted URL and auth headers
final = s.get(new_url, headers=new_headers).json()
print(final)
hsy9z1o
webscraping
bushcat69
t1_hsy9z1o
https://www.reddit.com/r/webscraping/comments/s4uq38/scraping_an_airtable_shared_view/hsy9z1o/
1/16/2022 9:24:37 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
s4uq38
t3_s4uq38
s4uq38
1
s4uq38
False
False
False
0
1
10
10
0
0
1
0.268817204301075
0
0
309
83.0645161290323
372
128, 128, 128
3
Solid
50
Yes
28
RepliedTo
1/16/2022 9:42:32 PM
I will test this week! Thank you!
I'm looking for a way to keep my data updated, and a script would help! Thanks again
hsycp7g
webscraping
joachimbrnd
t1_hsycp7g
https://www.reddit.com/r/webscraping/comments/s4uq38/scraping_an_airtable_shared_view/hsycp7g/
1/16/2022 9:42:32 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hsy9z1o
t1_hsy9z1o
hsy9z1o
0
s4uq38
True
False
False
1
1
10
10
1
4
0
0
0
0
11
44
25
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
27
Commented
1/15/2022 10:09:07 PM
have you looked into this:
[https://medium.com/labtek-indie/scraping-airtable-part-2-cd160188abd4](https://medium.com/labtek-indie/scraping-airtable-part-2-cd160188abd4)
or this:
[https://colab.research.google.com/github/banditelol/airscraper/blob/master/notebook/Airtable%20Scraping%20CSV.ipynb#scrollTo=OQKWXl5kwzH5](https://colab.research.google.com/github/banditelol/airscraper/blob/master/notebook/Airtable%20Scraping%20CSV.ipynb#scrollTo=OQKWXl5kwzH5)
I'm not sure if you need to log in. When I used the Google Colab file, it looks like it requires some authentication.
hstkke6
webscraping
sudodoyou
t1_hstkke6
https://www.reddit.com/r/webscraping/comments/s4uq38/scraping_an_airtable_shared_view/hstkke6/
1/15/2022 10:09:07 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
s4uq38
t3_s4uq38
s4uq38
1
s4uq38
False
False
False
0
2
10
10
2
2.38095238095238
0
0
0
0
53
63.0952380952381
84
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
26
RepliedTo
1/16/2022 2:38:51 PM
Hey,
Just this morning, after three Fiverr failures (😓), I managed to get all the data using this: https://chrome.google.com/webstore/detail/airtable-data-extractor/hecfmeibolfopfblloblipiebnofllac?hl=fa
Thank you for your answer, I had indeed seen the first link!
hswjyk6
webscraping
joachimbrnd
t1_hswjyk6
https://www.reddit.com/r/webscraping/comments/s4uq38/scraping_an_airtable_shared_view/hswjyk6/
1/16/2022 2:38:51 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hstkke6
t1_hstkke6
hstkke6
1
s4uq38
True
False
False
1
4
10
10
1
3.44827586206897
1
3.44827586206897
0
0
12
41.3793103448276
29
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
25
RepliedTo
1/16/2022 8:39:50 PM
Fiverr can really be hit or miss. Nice to hear you found a solution. Never even thought of looking for a chrome extension!
hsy30sk
webscraping
sudodoyou
t1_hsy30sk
https://www.reddit.com/r/webscraping/comments/s4uq38/scraping_an_airtable_shared_view/hsy30sk/
1/16/2022 8:39:50 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hswjyk6
t1_hswjyk6
hswjyk6
1
s4uq38
False
False
False
2
2
10
10
1
4.34782608695652
1
4.34782608695652
0
0
11
47.8260869565217
23
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
24
RepliedTo
1/16/2022 9:45:27 PM
Yeah, for me it's mainly been miss and miss. People who say "yes sir", "I'm sure sir", and end up making me lose a day until they figure out that no, they can't. 😅
That and people who advertise one price and then ask for ten times that as a custom offer...
I have paid hundreds of euros on Fiverr and I think only a fifth of that was worth it.
hsyd4z4
webscraping
joachimbrnd
t1_hsyd4z4
https://www.reddit.com/r/webscraping/comments/s4uq38/scraping_an_airtable_shared_view/hsyd4z4/
1/16/2022 9:45:27 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hsy30sk
t1_hsy30sk
hsy30sk
0
s4uq38
True
False
False
3
4
10
10
1
1.47058823529412
3
4.41176470588235
0
0
26
38.2352941176471
68
128, 128, 128
3
Solid
50
No
30
Posted
1/15/2022 9:26:28 PM
Hello everyone,
I was wondering if anyone had ever tried to scrape an airtable frame like [here](https://airtable.com/shrtlnin6RFvKPU9A/tbl2MH2ztc72AKFFL?backgroundColor=red&viewControls=on).
I was using Octoparse (I'm not a developer), then I tried the Airscraper repo on GitHub, to no avail. Finally I paid a guy 25 bucks on Fiverr and he failed and asked to refund me...
Is such a page possible to scrape?
Thanks a lot :)
s4uq38
webscraping
joachimbrnd
t3_s4uq38
https://www.reddit.com/r/webscraping/comments/s4uq38/scraping_an_airtable_shared_view/
1/15/2022 9:26:28 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scraping an airtable shared view
False
1
s4uq38
0
1
10
10
1
1.36986301369863
1
1.36986301369863
0
0
34
46.5753424657534
73
128, 128, 128
3
Solid
50
No
23
Posted
5/15/2022 5:42:17 PM
I'd like to get notified when there are new listings with a phone number which starts with certain numbers in this web:
[https://www.idealista.com/venta-viviendas/alicante-alacant-alicante/?ordenado-por=fecha-publicacion-desc](https://www.idealista.com/venta-viviendas/alicante-alacant-alicante/?ordenado-por=fecha-publicacion-desc)
Is there any easy way with some service or program like octoparse or parsehub?
Edit: If I can scrape all the phone numbers and send them to Google Sheets, I can set Zapier to send me an email when a number matches my conditions.
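The filtering step itself is simple once the numbers are scraped; a minimal sketch, with made-up prefixes and listings:

```python
def matches_prefixes(phone: str, prefixes: tuple = ("600", "610")) -> bool:
    """True if the scraped phone number starts with a watched prefix (placeholders here)."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    return digits.startswith(prefixes)

def alert_worthy(listings):
    """listings: (url, phone) pairs from the results page; returns URLs to alert on."""
    return [url for url, phone in listings if matches_prefixes(phone)]
```

This is the same condition Zapier would apply, just done locally before anything reaches Google Sheets.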
uqb8su
webscraping
maikelnait
t3_uqb8su
https://www.reddit.com/r/webscraping/comments/uqb8su/any_easy_way_to_get_alerts_for_certain_new/
5/15/2022 5:42:17 PM
5/15/2022 6:08:43 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Any easy way to get alerts for certain new listings in this web?
False
1
uqb8su
0
1
5
5
1
1.06382978723404
1
1.06382978723404
0
0
46
48.936170212766
94
128, 128, 128
3
Solid
50
No
22
Commented
5/15/2022 6:08:34 PM
https://duckduckgo.com/?q=web+page+change+monitoring&ia=web
i8q0fwf
webscraping
mdaniel
t1_i8q0fwf
https://www.reddit.com/r/webscraping/comments/uqb8su/any_easy_way_to_get_alerts_for_certain_new/i8q0fwf/
5/15/2022 6:08:34 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
uqb8su
t3_uqb8su
uqb8su
0
uqb8su
False
False
False
0
1
5
5
0
0
0
0
0
0
0
0
0
128, 128, 128
3
Solid
50
Yes
21
Commented
5/17/2021 6:51:45 PM
It's stored as the number of milliseconds since the Unix epoch, which is very common when dealing with dates on a computer. To get something readable, divide by 1000 to get seconds and convert into a human-readable date like in [this stackoverflow question](https://stackoverflow.com/questions/48484020/convert-epoch-time-to-readable-time-in-excel).
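In Python, for instance, the same conversion is a one-liner: divide by 1000 to get seconds, then format as a date.

```python
from datetime import datetime, timezone

def ms_to_readable(ms: int) -> str:
    """Convert a millisecond Unix timestamp to a human-readable UTC date string."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

print(ms_to_readable(1620945297000))  # one of the values from the exported sheet
```

In Excel itself, the linked answer does roughly the same: divide the value by 86,400,000 (milliseconds per day) and add `DATE(1970,1,1)`.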
gyh7pat
webscraping
firwolf
t1_gyh7pat
https://www.reddit.com/r/webscraping/comments/neo5t3/making_sense_of_octoparse_timestamps/gyh7pat/
5/17/2021 6:51:45 PM
5/17/2021 6:56:30 PM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
neo5t3
t3_neo5t3
neo5t3
1
neo5t3
False
False
False
0
1
2
2
3
5.55555555555556
0
0
0
0
25
46.2962962962963
54
128, 128, 128
3
Solid
50
Yes
20
RepliedTo
5/17/2021 7:35:43 PM
Perfect - thanks for the help!
gyhdx6z
webscraping
Rectangulau
t1_gyhdx6z
https://www.reddit.com/r/webscraping/comments/neo5t3/making_sense_of_octoparse_timestamps/gyhdx6z/
5/17/2021 7:35:43 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gyh7pat
t1_gyh7pat
gyh7pat
0
neo5t3
True
False
False
1
1
2
2
1
20
0
0
0
0
2
40
5
128, 128, 128
3.00378429517502
Solid
49.983781592107
No
435
Posted
5/17/2021 6:41:55 PM
Hi,
I've pulled some tweets using Octoparse's standard template.
However, I'm not sure how to make sense of the timestamps shown for the tweets after exporting to Excel. They are merely listed as indecipherable numbers (e.g. 1620945297000, 1620945297000...).
Does anyone have insights into how I can make this useful?
neo7tb
learnprogramming
Rectangulau
t3_neo7tb
https://www.reddit.com/r/learnprogramming/comments/neo7tb/making_sense_of_octoparses_timestamps/
5/17/2021 6:41:55 PM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Making sense of Octoparse's timestamps
False
0.67
neo7tb
0
5
2
2
2
3.92156862745098
0
0
0
0
24
47.0588235294118
51
128, 128, 128
3.00378429517502
Solid
49.983781592107
No
434
Commented
5/17/2021 7:38:58 PM
That makes perfect sense.
Thank you for the help!
gyhedy4
learnprogramming
Rectangulau
t1_gyhedy4
https://www.reddit.com/r/learnprogramming/comments/neo7tb/making_sense_of_octoparses_timestamps/gyhedy4/
5/17/2021 7:38:58 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
neo7tb
t3_neo7tb
neo7tb
0
neo7tb
True
False
False
0
5
2
2
2
22.2222222222222
0
0
0
0
3
33.3333333333333
9
128, 128, 128
3.00378429517502
Solid
49.983781592107
No
433
Posted
5/17/2021 6:39:36 PM
Hi,
I've pulled some tweets using Octoparse's standard template.
However, I'm not sure how to make sense of the timestamps shown for the tweets after exporting to Excel (see below).
Does anyone have insights into how I can make this useful?
https://preview.redd.it/iifc9tpp6qz61.png?width=359&format=png&auto=webp&v=enabled&s=22bd8591f4421d264867f27d4d9d9a23782518ac
neo5t3
webscraping
Rectangulau
t3_neo5t3
https://www.reddit.com/r/webscraping/comments/neo5t3/making_sense_of_octoparse_timestamps/
5/17/2021 6:39:36 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Making sense of Octoparse timestamps
False
1
neo5t3
0
5
2
2
2
4.44444444444444
0
0
0
0
20
44.4444444444444
45
128, 128, 128
3
Solid
50
No
19
Posted
7/6/2022 11:58:08 PM
On July 6, Voice of America reported that Meta, the parent company of Facebook, filed separate lawsuits on Tuesday (July 5) against two data-scraping sites, accusing them of scraping user data from Facebook and Instagram for unauthorized use. One of the defendants is the US subsidiary of a Chinese state-recognized high-tech enterprise.
In its complaint, Meta says that Octopus Data, based in California, has operated an "unlawful service" called Octoparse since March 25, 2015, whose purpose is to "improperly collect, or 'scrape,' user account profiles and other information" from numerous websites, including Facebook, Instagram, Twitter, and Amazon.
...
[Timed News](https://www.timednews.com/article/2022/07/06/20636.html)
https://preview.redd.it/ulxa1bnud1a91.png?width=870&format=png&auto=webp&v=enabled&s=26f65e20f5f9c52c4ce98e872e96436a2133f87e
vt4go1
TimedNews
uaskmebefore
t3_vt4go1
https://www.reddit.com/r/TimedNews/comments/vt4go1/中企美国子公司八爪鱼数据抓用户数据_脸书提告/
7/6/2022 11:58:08 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
US subsidiary of Chinese firm "Octopus Data" scraped user data; Facebook sues
False
1
vt4go1
0
1
1
1
0
0
0
0
0
0
39
90.6976744186046
43
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
496
Posted
2/18/2021 3:38:03 PM
Hey
I need to extract this data from [https://battlefy.com/msi-gaming-arena](https://battlefy.com/msi-gaming-arena) and the methods and tools that I usually use *(Octoparse and Web Scraping from Google Chrome)* don't work for me with this page.
Any advice or suggestion?
https://preview.redd.it/1eqj4m2p99i61.png?width=948&format=png&auto=webp&v=enabled&s=42530ae64c64182380bd1f468c54f502ed015b84
lmpgsc
webscraping
juanchi_parra
t3_lmpgsc
https://www.reddit.com/r/webscraping/comments/lmpgsc/im_looking_for_effective_web_scraping_methods_and/
2/18/2021 3:38:03 PM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
I'm looking for effective web scraping methods and tools for this data
False
0.67
lmpgsc
0
4
28
28
1
2.12765957446809
0
0
0
0
23
48.936170212766
47
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
495
Posted
2/18/2021 3:41:57 PM
Hey
I need to extract this data from [https://battlefy.com/msi-gaming-arena](https://battlefy.com/msi-gaming-arena) and the methods and tools that I usually use *(Octoparse and Web Scraping from Google Chrome)* don't work for me with this page.
Any advice or suggestion?
lmpjou
learnpython
juanchi_parra
t3_lmpjou
https://www.reddit.com/r/learnpython/comments/lmpjou/im_looking_for_effective_web_scraping_methods_and/
2/18/2021 3:41:57 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
I'm looking for effective web scraping methods and tools for this data
False
1
lmpjou
0
4
28
28
1
2.12765957446809
0
0
0
0
23
48.936170212766
47
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
18
Commented
2/18/2021 4:08:14 PM
How does it not work? Is it throwing errors or just giving you unexpected results?
gnwc8hs
learnpython
BobHogan
t1_gnwc8hs
https://www.reddit.com/r/learnpython/comments/lmpjou/im_looking_for_effective_web_scraping_methods_and/gnwc8hs/
2/18/2021 4:08:14 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
lmpjou
t3_lmpjou
lmpjou
1
lmpjou
False
False
False
0
2
28
28
1
6.66666666666667
2
13.3333333333333
0
0
3
20
15
128, 128, 128
3
Solid
50
Yes
17
RepliedTo
2/18/2021 6:26:17 PM
They don't recognize the data. "There is no information to extract"
gnwwd8w
learnpython
juanchi_parra
t1_gnwwd8w
https://www.reddit.com/r/learnpython/comments/lmpjou/im_looking_for_effective_web_scraping_methods_and/gnwwd8w/
2/18/2021 6:26:17 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gnwc8hs
t1_gnwc8hs
gnwc8hs
1
lmpjou
True
False
False
1
1
28
28
0
0
0
0
0
0
4
36.3636363636364
11
128, 128, 128
3.00094607379376
Solid
49.9959453980268
Yes
16
RepliedTo
2/18/2021 7:00:19 PM
You can try using Beautiful Soup and Requests to scrape it.
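A minimal sketch of that combination. One caveat: the Battlefy page is rendered client-side, so a plain `requests.get` may only return the page shell; the parsing pattern below is shown on static markup, and the `.team-name` class is a made-up placeholder.

```python
from bs4 import BeautifulSoup

def parse_teams(html: str) -> list:
    """Extract team names from page markup; '.team-name' is a hypothetical selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [el.get_text(strip=True) for el in soup.select(".team-name")]

# With Requests, the fetch side would be:
#   html = requests.get("https://battlefy.com/msi-gaming-arena").text
#   print(parse_teams(html))
```

If the shell comes back empty, the data is being loaded by JavaScript and a browser-driven tool (or the site's underlying API) is the next thing to try.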
gnx1iju
learnpython
BobHogan
t1_gnx1iju
https://www.reddit.com/r/learnpython/comments/lmpjou/im_looking_for_effective_web_scraping_methods_and/gnx1iju/
2/18/2021 7:00:19 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
gnwwd8w
t1_gnwwd8w
gnwwd8w
0
lmpjou
False
False
False
2
2
28
28
1
9.09090909090909
0
0
0
0
5
45.4545454545455
11
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
15
Posted
9/16/2022 12:03:11 PM
https://yournamewebsite.com/?p=11908
xfpfnp
yournamewebsite
DBlackLabel31
t3_xfpfnp
https://www.reddit.com/r/yournamewebsite/comments/xfpfnp/how_to_extract_data_from_website_to_excel/
9/16/2022 12:03:11 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Extract Data From Website to Excel Automatically If you've ever wanted to extract data from a website and put it into an Excel spreadsheet, you're in luck. There are plenty of ways to do this. You can use software like Octoparse, Microsoft Research Labs Excel 2007 Web Data Add-In, and VBA
False
1
xfpfnp
0
4
1
1
128, 128, 128
3.00283822138127
Solid
49.9878361940803
No
14
Posted
9/16/2022 12:03:10 PM
https://yournamewebsite.com/?p=11908
xfpfmz
yournamewebsite
DBlackLabel31
t3_xfpfmz
https://www.reddit.com/r/yournamewebsite/comments/xfpfmz/how_to_extract_data_from_website_to_excel/
9/16/2022 12:03:10 PM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Extract Data From Website to Excel Automatically If you've ever wanted to extract data from a website and put it into an Excel spreadsheet, you're in luck. There are plenty of ways to do this. You can use software like Octoparse, Microsoft Research Labs Excel 2007 Web Data Add-In, and VBA
False
1
xfpfmz
0
4
1
1
128, 128, 128
3
Solid
50
No
12
Commented
6/15/2021 7:12:55 PM
Fairly doable: you can either automate the manual process of inputting the address and then screen-scrape the result (with Selenium & similar), or talk directly to the API as suggested by u/mental_diarrhea, which will give you this kind of response from their server, from which you can extract everything you need:
{"qualified":false,"reservation":null,"emailAddress":null,"addressLine1":"502 UTAH","addressLine2":"","city":"HULL","state":"TX","zipCode":"77564","reasonCode":"Non_Eligible","addressType":null,"installType":null,"floorToStreetsMap":{},"phoneNumber":null,"addressfromAccount":false,"currentFloorNumber":null,"eventCorrelationId":null,"verifyE911Address":false,"polylines":{},"launchType":null,"apartmentNumberRequired":false,"floorPlanAvailable":false,"addressMapStatus":null,"maxFloor":null,"buildingDetails":null,"uberPinEligible":false,"intersectionCoordinatesLst":null,"coveragePercentage":null,"equipType":null,"addressDescriptorList":null,"bundleNames":null,"qualified4GHome":false,"isPendingCart":false,"isExpiredCart":false,"isStreetSelected":false,...
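Once you have a response like that, pulling out the fields you care about (e.g. `qualified` and `reasonCode`) is a couple of lines; a sketch on a shortened, made-up stand-in for the payload quoted above:

```python
import json

# A trimmed stand-in for the server response quoted above
response_text = '{"qualified": false, "city": "HULL", "state": "TX", "reasonCode": "Non_Eligible"}'

data = json.loads(response_text)
print(data["qualified"], data["reasonCode"])  # False Non_Eligible
```

The real response has many more keys, but they all come back as ordinary dict entries after `json.loads`.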
h1vrd6l
webscraping
Meaveready
t1_h1vrd6l
https://www.reddit.com/r/webscraping/comments/o0dtm4/scraping_output_after_an_input/h1vrd6l/
6/15/2021 7:12:55 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o0dtm4
t3_o0dtm4
o0dtm4
1
o0dtm4
False
False
False
0
1
37
37
2
1.68067226890756
10
8.40336134453782
0
0
77
64.7058823529412
119
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
11
RepliedTo
6/15/2021 7:22:15 PM
I like talking to APIs whenever possible because they're not such big resource hogs; Selenium can get pretty heavy, even headless, when you want to spawn hundreds of instances for uniqueness. Apparently Verizon has some kind of security implemented, at least for the number of requests (consecutive, not simultaneous; the latter can be even worse), and the limit is about 20. Spawning a new Selenium instance every 20 requests is simply not efficient for large quantities to scrape, although if OP has fewer than 1k addresses it can work out and the time difference will be negligible.
Besides, finding APIs is more hacky and I do it for funsies, so... 😃
h1vsnsr
webscraping
mental_diarrhea
t1_h1vsnsr
https://www.reddit.com/r/webscraping/comments/o0dtm4/scraping_output_after_an_input/h1vsnsr/
6/15/2021 7:22:15 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h1vrd6l
t1_h1vrd6l
h1vrd6l
2
o0dtm4
False
False
False
1
4
37
37
3
2.88461538461538
2
1.92307692307692
0
0
43
41.3461538461538
104
128, 128, 128
3
Solid
50
Yes
10
RepliedTo
6/15/2021 11:11:21 PM
That's really neat. It's like finding the cat flap and then disguising yourself as the house's cat so you can walk in, open the fridge and see for yourself, instead of calling to the owner from the window and asking him to open the fridge and tell you what's in there x)
h1wmk0w
webscraping
Meaveready
t1_h1wmk0w
https://www.reddit.com/r/webscraping/comments/o0dtm4/scraping_output_after_an_input/h1wmk0w/
6/15/2021 11:11:21 PM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h1vsnsr
t1_h1vsnsr
h1vsnsr
1
o0dtm4
False
False
False
2
1
37
37
1
2
0
0
0
0
23
46
50
128, 128, 128
3.00283822138127
Solid
49.9878361940803
Yes
9
RepliedTo
6/16/2021 7:09:19 AM
I'd say it's more like wanting to watch neighbors' TV, so instead of buying binoculars, I break in, plug into their TV set, and pretend to be the wife so I can choose the channel. 😃
h1xyrbr
webscraping
mental_diarrhea
t1_h1xyrbr
https://www.reddit.com/r/webscraping/comments/o0dtm4/scraping_output_after_an_input/h1xyrbr/
6/16/2021 7:09:19 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h1wmk0w
t1_h1wmk0w
h1wmk0w
0
o0dtm4
False
False
False
3
4
37
37
0
0
2
5.71428571428571
0
0
14
40
35
128, 128, 128
3
Solid
50
No
13
Posted
6/15/2021 1:01:33 PM
I'm trying to do some web scraping where I enter data into an input field, and then capture the results the site spits out. As an example, go to [https://www.verizon.com/5g/home/#checkAvailability](https://www.verizon.com/5g/home/#checkAvailability), enter an address (from a spreadsheet), click check availability, and then record what the output is. I've tried a couple tools (Octoparse, Parsehub, Webscraper), but haven't been able to get it to work. I've done projects like this in the past using Mechanical Turk to quasi-automate it, but I'd like to be able to fully automate it. Certainly doesn't need to be free. Any suggestions?
o0dtm4
webscraping
BubbaJoe2000
t3_o0dtm4
https://www.reddit.com/r/webscraping/comments/o0dtm4/scraping_output_after_an_input/
6/15/2021 1:01:33 PM
1/1/0001 12:00:00 AM
False
False
4
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scraping output after an input?
False
1
o0dtm4
0
1
37
37
2
1.8348623853211
0
0
0
0
49
44.954128440367
109
128, 128, 128
3
Solid
50
No
8
Commented
6/15/2021 2:53:29 PM
Should be possible with Python (or basically anything that can send a query; I'm using Python because I'm lazy). It uses some basic API calls and doesn't seem overly hard to get.
You can pm me for deets, if I'll be bored enough I can roll something for ya.
h1ur40t
webscraping
mental_diarrhea
t1_h1ur40t
https://www.reddit.com/r/webscraping/comments/o0dtm4/scraping_output_after_an_input/h1ur40t/
6/15/2021 2:53:29 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
o0dtm4
t3_o0dtm4
o0dtm4
0
o0dtm4
False
False
False
0
1
37
37
1
2.08333333333333
3
6.25
0
0
18
37.5
48
128, 128, 128
3
Solid
50
No
7
RepliedTo
6/15/2021 9:45:53 PM
This is good advice.
h1wc544
webscraping
MountProxies
t1_h1wc544
https://www.reddit.com/r/webscraping/comments/o0dtm4/scraping_output_after_an_input/h1wc544/
6/15/2021 9:45:53 PM
1/1/0001 12:00:00 AM
False
False
3
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
h1vsnsr
t1_h1vsnsr
h1vsnsr
0
o0dtm4
False
False
False
2
1
37
37
1
25
0
0
0
0
1
25
4
128, 128, 128
3
Solid
50
Yes
6
Commented
12/7/2021 7:33:52 AM
Amazing article, I will save it. I am too far from web development, but I see high potential in my country for this industry.
hnkhmsq
Octoparse_ideas
Logical_Bowl_5442
t1_hnkhmsq
https://www.reddit.com/r/Octoparse_ideas/comments/ppscvt/how_to_develop_and_grow_your_niche_job_board/hnkhmsq/
12/7/2021 7:33:52 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
ppscvt
t3_ppscvt
ppscvt
1
ppscvt
False
False
False
0
1
3
3
1
4.16666666666667
0
0
0
0
10
41.6666666666667
24
128, 128, 128
3
Solid
50
Yes
5
RepliedTo
12/8/2021 1:09:26 AM
Awesome~
hnnwcrn
Octoparse_ideas
Octoparseideas
t1_hnnwcrn
https://www.reddit.com/r/Octoparse_ideas/comments/ppscvt/how_to_develop_and_grow_your_niche_job_board/hnnwcrn/
12/8/2021 1:09:26 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
hnkhmsq
t1_hnkhmsq
hnkhmsq
0
ppscvt
True
False
False
1
1
3
3
1
100
0
0
0
0
0
0
1
Red
10
Dash Dot Dot
20
No
1092
Posted
12/21/2021 8:32:38 AM
[http://www.dataextraction.io/?p=1181/?re=](http://www.dataextraction.io/?p=1181/?re=)
It’s interesting to read articles about remote working: how remote workers should be managed, why you should wear pants while working at home. After reading about how working from home is fueling the world, a new article soon jumps out at you: remote working is failing Silicon Valley.
Remote work has become a hot topic, and it is the trend.
If you are planning to learn new skills and make money from home, you may be interested in web scraping.
* When everything becomes data-driven, [all walks of life are looking for data](https://www.octoparse.com/WebScraping).
* Web data scraping can be done everywhere, single-handedly.
* Web scraping is so versatile that it can benefit your work unexpectedly.
You may get a sense of web scraping later. Let’s dive right in.
# What Data Are People Paying For
Some of [Octoparse](https://www.octoparse.com/)’s users are freelancers. They are doing exactly what I am talking about: offering web scraping services to help employers get structured web data.
You may be curious about what kinds of data people are paying for. I have visited a few job sites and found the most typical cases. In these cases, clients are looking for someone who can deliver quality web data to them efficiently. This data is important to their business or research.
This data is accessible once you get the hang of Octoparse, a no-code web scraping tool, which I will discuss later.
Let’s see what data is regarded as valuable to the client.
## Ecommerce Product Data
[Ecommerce data scraping](https://www.octoparse.com/blog/9-ways-ecommerce-data-can-fuel-your-online-business) is always in high demand. That’s why Octoparse has developed many [ready-to-use templates](https://service.octoparse.com/webscrapingtemplates) to scrape data from Amazon, eBay, Etsy, etc.
Online sellers are gathering product data, especially prices, to learn about the market and reinvent their assortment or adjust their pricing strategy.
This is a job posting on Freelancer.com looking for a web scraping expert to grab 150k eCommerce product listings. (And believe me, you don’t have to be such an expert to complete the job.)
[Source: Freelancer.com](https://preview.redd.it/bfzqo1jmwu681.png?width=757&format=png&auto=webp&v=enabled&s=cc890bd9a99889ddb1ab72a356909ee335d7c431)
This is a web scraping job rather than [data mining](https://www.investopedia.com/terms/d/datamining.asp). I bet they will be the ones to mine this data afterward, looking for insights to guide their decision-making process.
**Keyword:** data mining
**Industry:** eCommerce
**Data:** eCommerce product listings
**Rate:** around $150
## Email List Building
There are always people looking for email addresses. If you are a marketer, you must have received emails asking if you want to buy someone's email list. My thought: I work in the web scraping industry and I have my own [email scraper](https://www.octoparse.com/blog/extract-emails-from-any-website-for-cold-email-marketing).
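To illustrate the idea behind an email scraper (a generic sketch, not Octoparse's implementation), pulling addresses out of already-fetched page text is mostly pattern matching:

```python
import re

# A pragmatic (not RFC-complete) email pattern.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_emails(page_text: str) -> list[str]:
    # Lowercase and deduplicate while preserving first-seen order.
    seen, out = set(), []
    for match in EMAIL_RE.findall(page_text):
        email = match.lower()
        if email not in seen:
            seen.add(email)
            out.append(email)
    return out

sample = "Contact sales@example.com or press@example.com for details."
print(extract_emails(sample))  # → ['sales@example.com', 'press@example.com']
```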
[Source: Upwork.com](https://preview.redd.it/wfod790swu681.png?width=708&format=png&auto=webp&v=enabled&s=b6f693fa99196131c76d970efb7828fb59490570)
Web scraping is widely used by marketers to generate sales leads. The only problem is figuring out where your targets gather. For example, if you are looking for writers, look at Medium; if you are seeking CEOs, go to LinkedIn.
**Keyword**: lead generation
**Industry**: unknown
**Data**: emails of decision-makers
**Rate**: $40-$100/hour
## Phone Numbers Scraping
Besides email marketing, text message marketing also works wonders for some industries, for example, restaurants. And [Yelp](https://helpcenter.octoparse.com/hc/en-us/articles/900006618403-Scrape-business-information-from-Yelp) and [Google Maps](https://helpcenter.octoparse.com/hc/en-us/articles/900002292706-Scrape-Business-Information-from-Google-Maps) are perfect places to get numbers of local businesses.
However, this employer could make different use of the scraped phone numbers.
[Source: Freelancer.com](https://preview.redd.it/powvmr8vwu681.png?width=772&format=png&auto=webp&v=enabled&s=0dc380d8c9485866c0cb0103977550bc2159e5a9)
**Keyword**: WhatsApp, extract telephone numbers
**Data**: telephone numbers
**Industry**: unknown (probably home services provider)
**Rate**: around $7/hour (this guy is not going to find the right fit)
## LinkedIn Profile / CV
Yes, there is a constant need for LinkedIn web scraping, for there are a great number of high-quality businesses and talents active on the platform. However, scraping LinkedIn is a bit sensitive. In Europe, personal information is protected by [GDPR](https://www.octoparse.com/blog/gdpr-compliance-in-web-scraping), and legality should be considered if you are scraping personal information for business purposes.
[Source: Guru.com](https://preview.redd.it/td0r4h0ywu681.png?width=704&format=png&auto=webp&v=enabled&s=d8697f90c37c50ce0f1ffea79749a31c5bfc17e9)
**Keyword**: LinkedIn web scraping
**Data**: LinkedIn profiles
**Industry**: unknown
**Rate**: $250-$500
## Content Curation for App/Web Development
People use web scraping to gather data for analysis or contact information for marketing purposes. At the same time, web scraping is also used to grab data to feed a website or an application (or for data migration).
This is typically true for [an aggregator website](https://www.octoparse.com/blog/how-to-create-an-aggregator-website).
There is a group of people making money by building websites (e.g. price comparison websites) with scraped web data. And some are [building up a business/company](https://www.octoparse.com/blog/octoparse-customer-stories-use-web-scraping-to-make-money). Check out these [web scraping business ideas](https://www.octoparse.com/blog/10-web-scraping-business-ideas-for-everyone); you may find a startup idea to work on.
# How to Get Web Data Like a Pro
Octoparse is designed as a no-code web scraping tool for non-coders to crawl data from the web.
What does it mean to crawl from the web?
Simply put, it means getting data from web pages into Excel, CSV, or other file formats for offline use. This is an oversimplified explanation, but that's how it works for most people.
For ordinary daily use, we get the data from a web page down to a file so we can compile, process, analyze, or simply save a copy.
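For a feel of what happens under the hood, "web page to file" boils down to parsing markup and writing rows out. A bare-bones standard-library sketch (no relation to how Octoparse is actually built):

```python
import csv
import io
from html.parser import HTMLParser

class TableRows(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr>."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], [], False
    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_cell = True
    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row:
            self.rows.append(self._row)
    def handle_data(self, data):
        if self._in_cell:
            self._row.append(data.strip())

def table_to_csv(html: str) -> str:
    # Parse an HTML table and render it as CSV text.
    parser = TableRows()
    parser.feed(html)
    buf = io.StringIO()
    csv.writer(buf).writerows(parser.rows)
    return buf.getvalue()

html = "<table><tr><td>Mizu Sushi</td><td>4.5</td></tr></table>"
print(table_to_csv(html))
```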
A no-code web scraping tool like Octoparse makes this job much easier than before. Otherwise, building and maintaining a web scraper can be extremely [sophisticated](https://www.octoparse.com/blog/9-web-scraping-challenges).
So how to get web data using Octoparse?
Octoparse loads the URLs given by the user and renders the entire website in its built-in browser. As a result, you can point and click on the data in the browser and set commands to tell the robot what data to scrape.
[https://youtu.be/yu8vUFIMYzE](https://youtu.be/yu8vUFIMYzE)
For example, you might want to extract business data from Yelp. All you have to do is paste the URL into Octoparse, select the desired data, and set the command to extract. Octoparse will do the data scraping for you.
[Download Octoparse Free](https://www.octoparse.com/download) here and try it out with this step-by-step guide! You can be a data provider with a smart data grabber.
[>>Scrape from Yelp](https://helpcenter.octoparse.com/hc/en-us/articles/900005216263-Scrape-customer-reviews-from-Yelp-Version-8-)
# Conclusions
No-code tools let non-coders work as efficiently as programmers.
Octoparse is about making web data accessible for everyone, especially business owners who need data for smart decision-making and data scientists who rely on data to get closer to the truth.
Try it out. Have fun with data.
rla6pu
u_Octoparseideas
Octoparseideas
t3_rla6pu
https://www.reddit.com/r/u_Octoparseideas/comments/rla6pu/remote_working_make_money_with_web_scraping/
12/21/2021 8:32:38 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Remote Working: Make Money with Web Scraping
False
1
rla6pu
0
32057
3
3
33
2.54826254826255
9
0.694980694980695
0
0
687
53.0501930501931
1295
Red
10
Dash Dot Dot
20
No
1091
Posted
11/23/2021 7:17:30 AM
TAKE 30% OFF when Renew or Upgrade
【Standard Year】SAVE $200!
【Professional Year】SAVE $500!
👉 Get free crawlers & 1-on-1 training: [https://www.octoparse.com/2021-black-friday-sale/?re=](https://www.octoparse.com/2021-black-friday-sale/?re=)
Check out the services you need:
[https://service.octoparse.com/data-service/?re=](https://service.octoparse.com/data-service/?re=)
[https://service.octoparse.com/ecommercedata/?re=](https://service.octoparse.com/ecommercedata/?re=)
[https://service.octoparse.com/socialmedia/?re=](https://service.octoparse.com/socialmedia/?re=)
[https://service.octoparse.com/contentaggregation/?re=](https://service.octoparse.com/contentaggregation/?re=)
[https://service.octoparse.com/enterprise/?re=](https://service.octoparse.com/enterprise/?re=)
[https://service.octoparse.com/webscrapingtemplates/?re=](https://service.octoparse.com/webscrapingtemplates/?re=)
r07m95
Octoparse_ideas
Octoparseideas
t3_r07m95
https://www.reddit.com/r/Octoparse_ideas/comments/r07m95/hurry_up_black_friday_is_ending_soon/
11/23/2021 7:17:30 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
👏 Hurry up! Black Friday Is Ending Soon
False
1
r07m95
0
32057
3
3
1
0.826446280991736
0
0
0
0
79
65.2892561983471
121
Red
10
Dash Dot Dot
20
No
1090
Posted
10/8/2021 7:21:05 AM
Participate in the event on Twitter and win FREE gifts: [https://hubs.la/H0Z13tf0](https://hubs.la/H0Z13tf0)
https://preview.redd.it/0egsb1gwg6s71.png?width=1600&format=png&auto=webp&v=enabled&s=697ba9fc382fa9e7da9a7a400a30d6f89a04a4ef
🌟Octoparsing with Zapier🌟
3 steps to win gifts worth $270:
1. Connect Octoparse with Zapier & Export your cloud data to any app.
2. Take a screenshot showing the successful data export.
3. Share your screenshot & feedback quoting the event tweet.
Join us and play!
q3sqmf
Octoparse_ideas
Octoparseideas
t3_q3sqmf
https://www.reddit.com/r/Octoparse_ideas/comments/q3sqmf/participate_in_octoparsing_with_zapier_event_on/
10/8/2021 7:21:05 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Participate in "Octoparsing with Zapier" event on Twitter and win FREE gifts
False
1
q3sqmf
0
32057
3
3
5
7.8125
1
1.5625
0
0
32
50
64
Red
10
Dash Dot Dot
20
No
1089
Commented
10/8/2021 7:22:12 AM
\- Gold: Amazon Gift Card $20 + Custom Crawler Coupon $250
\- Silver: Amazon Gift Card $10
\- Bronze: Amazon Gift Card $5
Find the tutorial here: https://helpcenter.octoparse.com/hc/en-us/articles/4406338353689-How-to-Connect-Octoparse-with-Zapier
hftuk2w
Octoparse_ideas
Octoparseideas
t1_hftuk2w
https://www.reddit.com/r/Octoparse_ideas/comments/q3sqmf/participate_in_octoparsing_with_zapier_event_on/hftuk2w/
10/8/2021 7:22:12 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
q3sqmf
t3_q3sqmf
q3sqmf
0
q3sqmf
True
False
False
0
32057
3
3
1
4.34782608695652
0
0
0
0
19
82.6086956521739
23
Red
10
Dash Dot Dot
20
No
1088
Posted
3/1/2022 6:25:47 AM
Check out this video introducing the new features and improvements in Octoparse 8.5 with a quick demo.
Complete video👉https://youtu.be/nVycXF3np1o
https://reddit.com/link/t41d6k/video/kxkefxg4upk81/player
t41d6k
u_Octoparseideas
Octoparseideas
t3_t41d6k
https://www.reddit.com/r/u_Octoparseideas/comments/t41d6k/what_are_the_highlights_in_octoparse_85_version/
3/1/2022 6:25:47 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What are the highlights in Octoparse 8.5 version?
False
1
t41d6k
0
32057
3
3
1
4.54545454545455
0
0
0
0
11
50
22
Red
10
Dash Dot Dot
20
No
1087
Posted
3/22/2022 3:21:36 AM
Imagine your company has just developed a new type of shampoo. Wouldn’t it be great to have a list of all the beauty salon shops near you as well as those across the entire nation? Wouldn’t it be even better if you could conveniently locate their contact details, addresses, email addresses, phone numbers, and Facebook page?
This is what lead generation does. In this article, we are going to talk about lead generation and how to harvest sales leads (email addresses) from websites.
[https://www.octoparse.com/blog/email-extractor-geathering-sales-leads-in-minutes/?re=](https://www.octoparse.com/blog/email-extractor-geathering-sales-leads-in-minutes/?re=)
tjtorp
u_Octoparseideas
Octoparseideas
t3_tjtorp
https://www.reddit.com/r/u_Octoparseideas/comments/tjtorp/how_to_use_email_extractors_to_collect_sales/
3/22/2022 3:21:36 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Use Email Extractors to Collect Sales Leads in Minutes
False
1
tjtorp
0
32057
3
3
10
9.00900900900901
0
0
0
0
49
44.1441441441441
111
Red
10
Dash Dot Dot
20
No
1086
Posted
10/25/2021 4:07:41 AM
[View Poll](https://www.reddit.com/poll/qf8nqw)
qf8nqw
u_Octoparseideas
Octoparseideas
t3_qf8nqw
https://www.reddit.com/r/u_Octoparseideas/comments/qf8nqw/whats_your_favorite_new_feature_in_octoparse_842/
10/25/2021 4:07:41 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What's your favorite new feature in Octoparse 8.4.2 version?
False
1
qf8nqw
0
32057
3
3
0
0
0
0
0
0
5
62.5
8
Red
10
Dash Dot Dot
20
No
1085
Posted
9/8/2021 1:33:26 AM
A place for members of r/Octoparse_ideas to chat with each other
pk0ql2
Octoparse_ideas
Octoparseideas
t3_pk0ql2
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/
9/8/2021 1:33:26 AM
1/1/0001 12:00:00 AM
False
False
1
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
r/Octoparse_ideas Lounge
False
0.67
pk0ql2
0
32057
3
3
0
0
0
0
0
0
5
41.6666666666667
12
Red
10
Dash Dot Dot
20
No
1084
Posted
8/2/2022 7:53:03 AM
[https://youtu.be/9Y0JWOuDXS0](https://youtu.be/9Y0JWOuDXS0)
😊Hi all, we have recently released the new version 8.5.4! This video introduces the new features and improvements in a quick demo. If you have any problems or feedback, leave us a comment!
we5m0q
u_Octoparseideas
Octoparseideas
t3_we5m0q
https://www.reddit.com/r/u_Octoparseideas/comments/we5m0q/whats_new_in_octoparse_854/
8/2/2022 7:53:03 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What's new in Octoparse 8.5.4?
False
1
we5m0q
0
32057
3
3
1
2.17391304347826
1
2.17391304347826
0
0
17
36.9565217391304
46
Red
10
Dash Dot Dot
20
No
1083
Posted
6/22/2022 9:15:01 AM
Here is a list of the 30 most popular web scraping software tools. I put them all together under the umbrella of software, though they range from open-source libraries and browser extensions to desktop applications and more.
[https://www.octoparse.com/blog/top-30-free-web-scraping-software/?utm\_source=sale2022&utm\_medium=top30freewebscrapingsoftware&utm\_campaign=reddit](https://www.octoparse.com/blog/top-30-free-web-scraping-software/?utm_source=sale2022&utm_medium=top30freewebscrapingsoftware&utm_campaign=reddit)
vi0xuz
u_Octoparseideas
Octoparseideas
t3_vi0xuz
https://www.reddit.com/r/u_Octoparseideas/comments/vi0xuz/top_30_free_web_scraping_software_in_2022/
6/22/2022 9:15:01 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Top 30 Free Web Scraping Software in 2022
False
1
vi0xuz
0
32057
3
3
5
6.41025641025641
0
0
0
0
46
58.974358974359
78
Red
10
Dash Dot Dot
20
No
1082
Posted
6/29/2021 2:47:14 AM
[removed]
o9ze7l
u_Octoparseideas
Octoparseideas
t3_o9ze7l
https://www.reddit.com/r/u_Octoparseideas/comments/o9ze7l/how_to_scrape_data_save_information_from_any/
6/29/2021 2:47:14 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Scrape Data, Save Information from ANY Website for Offline Viewing?
False
1
o9ze7l
0
32057
3
3
0
0
0
0
0
0
1
100
1
Red
10
Dash Dot Dot
20
No
1081
Posted
2/17/2022 7:49:00 AM
🤔When using Octoparse, do you prefer local running or cloud running? Why?
🤗Make the choice, comment with your reasons, and win a 1-month free extension!
[View Poll](https://www.reddit.com/poll/sujb76)
sujb76
u_Octoparseideas
Octoparseideas
t3_sujb76
https://www.reddit.com/r/u_Octoparseideas/comments/sujb76/local_scraping_vs_cloud_scraping/
2/17/2022 7:49:00 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Local scraping vs. Cloud scraping
False
1
sujb76
0
32057
3
3
3
8.82352941176471
1
2.94117647058824
0
0
16
47.0588235294118
34
Red
10
Dash Dot Dot
20
No
1080
Posted
9/2/2022 6:47:57 AM
Web scraping makes it possible to gather and store a lot of data quickly. This article will assist you in scraping customer reviews from Trustpilot, an online review platform.
[https://www.octoparse.com/blog/how-to-scrape-trustpilot/?utm\_source=2022q3&utm\_medium=how-to-scrape-trustpilot&utm\_campaign=reddit](https://www.octoparse.com/blog/how-to-scrape-trustpilot/?utm_source=2022q3&utm_medium=how-to-scrape-trustpilot&utm_campaign=reddit)
x3ulm3
u_Octoparseideas
Octoparseideas
t3_x3ulm3
https://www.reddit.com/r/u_Octoparseideas/comments/x3ulm3/best_trustpilot_scraper_to_get_data_from/
9/2/2022 6:47:57 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Best Trustpilot Scraper to Get Data from Trustpilot Easily
False
1
x3ulm3
0
32057
3
3
0
0
0
0
0
0
43
59.7222222222222
72
Red
10
Dash Dot Dot
20
No
1079
Posted
3/28/2022 8:49:46 AM
[https://www.octoparse.com/blog/what-is-a-web-crawler-and-how-does-it-work-at-your-benefit/?re=](https://www.octoparse.com/blog/what-is-a-web-crawler-and-how-does-it-work-at-your-benefit/?re=)
A web crawler, also known as a web spider or search engine bot, is a bot that visits and indexes the content of web pages all over the Internet. In this article, we will show you how web crawlers work and explain the differences between web crawling and web scraping.
https://preview.redd.it/d99sldvl83q81.jpg?width=800&format=pjpg&auto=webp&v=enabled&s=ccfb64adfebcd65412d0fd29dce22b13700f26d7
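The core loop of a crawler is simple: fetch a page, collect its links, and queue any not yet visited. A toy sketch of that loop follows, with fetching stubbed out so it runs offline; a real crawler must also respect robots.txt and rate limits.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Gather the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(base_url: str, html: str) -> list[str]:
    # Resolve relative hrefs against the page's own URL.
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links]

def crawl(start_url: str, fetch, max_pages: int = 10) -> list[str]:
    # `fetch` is any callable url -> html, so the sketch stays offline.
    seen, queue, order = {start_url}, deque([start_url]), []
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        for link in extract_links(url, fetch(url)):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

# A tiny in-memory "website" standing in for the live Internet.
site = {
    "http://example.com/": '<a href="/a">A</a><a href="/b">B</a>',
    "http://example.com/a": '<a href="/">home</a>',
    "http://example.com/b": "",
}
print(crawl("http://example.com/", lambda u: site.get(u, "")))
```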
tq4wzh
Octoparse_ideas
Octoparseideas
t3_tq4wzh
https://www.reddit.com/r/Octoparse_ideas/comments/tq4wzh/what_is_a_web_crawler_and_how_does_it_work/
3/28/2022 8:49:46 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What Is a Web Crawler and How Does It Work
False
1
tq4wzh
0
32057
3
3
5
5.49450549450549
0
0
0
0
37
40.6593406593407
91
Red
10
Dash Dot Dot
20
No
1078
Posted
6/29/2021 2:47:14 AM
[removed]
o9ze7l
u_Octoparseideas
Octoparseideas
t3_o9ze7l
https://www.reddit.com/r/u_Octoparseideas/comments/o9ze7l/how_to_scrape_data_save_information_from_any/
6/29/2021 2:47:14 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Scrape Data, Save Information from ANY Website for Offline Viewing?
False
1
o9ze7l
0
32057
3
3
Red
10
Dash Dot Dot
20
No
1077
Posted
4/26/2023 7:03:02 AM
Despite slowing growth, eCommerce sales are expected to reach $6.3 trillion by 2023. To stay ahead of the competition, utilizing data through web scraping is crucial for success in the highly digital eCommerce industry.
[https://www.octoparse.com/blog/web-scraping-and-ecommerce-business#](https://www.octoparse.com/blog/web-scraping-and-ecommerce-business#)
12z9f3d
Octoparse_ideas
Octoparseideas
t3_12z9f3d
https://www.reddit.com/r/Octoparse_ideas/comments/12z9f3d/revolutionize_your_online_business_with_web/
4/26/2023 7:03:02 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Revolutionize Your Online Business with Web Scraping: Unleashing the Power of Data
False
1
12z9f3d
0
32057
3
3
1
1.81818181818182
0
0
0
0
33
60
55
Red
10
Dash Dot Dot
20
No
1076
Posted
11/22/2021 7:58:53 AM
[http://www.dataextraction.io/?p=1163/?re=](http://www.dataextraction.io/?p=1163/?re=)
As a hard-core fan of all kinds of music, I am always eager to know what’s new and popular on Billboard. Since I also play music and write music reviews myself, I need to analyze the latest hits on the website. But copying and pasting the list of songs manually is time-consuming. What can I do to speed up the process?
Thanks to Octoparse, I can complete the list crawling task with a Billboard music scraper I created myself. In this article, I will demonstrate how [web scraping](https://en.wikipedia.org/wiki/Web_scraping) works by crawling the hot 100 songs on Billboard. The same approach also works for scraping listings from websites in other industries.
# Build A Billboard Music Scraper in 3 Steps
Although I know nothing about coding, I can use [Octoparse](https://www.octoparse.com/download/windows) to set up my own Billboard music scraper in only 3 steps. What I intend to do is extract the information about the hot 100 songs. Normally, I would visit [Billboard.com](https://www.billboard.com/), find the “[Hot 100](https://www.billboard.com/charts/hot-100)” section, enter it, and start to copy and paste the data I need. Now, the whole process can be done with a scraping bot.
# Step 1 Enter the URL of Billboard and Find the Music Listings You Would Like to Crawl
After launching the software and logging in, I need to enter the URL of Billboard, and click on **“Start”**.
https://preview.redd.it/l5jyx7v9s3181.png?width=1400&format=png&auto=webp&v=enabled&s=01e1d1c8ea1cd95cbe3a3745bffb80544398d6c7
After the webpage finishes loading in Octoparse’s built-in browser, I should click on the **“Hot 100”** section under the scraping mode, and select **“Click URL”** on the **“Tips”** panel.
https://preview.redd.it/yxr1k7zbs3181.png?width=1400&format=png&auto=webp&v=enabled&s=eb05bb239b058724fc2378abf37cbc5f02eba924
# Step 2 Generate the Workflow of Your Billboard Music Scraper
Now, I can click on **“Auto-detect web data”** to find data on the page automatically.
https://preview.redd.it/d10bk0ees3181.png?width=1400&format=png&auto=webp&v=enabled&s=8017f4e8723d54c2a88954e8d9ce7fbe05d7b76c
Then, I need to switch the auto-detect results to the **“Hot 100”** chart. Since all 100 songs are on the same page and can be scraped while scrolling, there is no need to **“Paginate to scrape more pages”**, so I just uncheck the box. After clicking on **“Create workflow”**, my Billboard music scraper is ready.
https://preview.redd.it/mc15ztxgs3181.png?width=1400&format=png&auto=webp&v=enabled&s=1bc5f270f019cdc7324ecce9928053448e757917
# Step 3 Run the Task You Build and Extract the Data
After saving the task, I am ready to run the task and extract the data I need. Hit **“Run”**, and Octoparse will start to work for me. Since I’m a premium user, I can choose to extract the data either on my local device or on the cloud.
https://preview.redd.it/p99be9zhs3181.png?width=1160&format=png&auto=webp&v=enabled&s=fe28a51165cee8e9cd934087385e80e055a4e342
Free users can only extract the data locally. [Cloud extraction](https://helpcenter.octoparse.com/hc/en-us/articles/360018047092-What-is-Cloud-Extraction-) is available for those who go premium; it is more convenient because the data is saved to the cloud for easy access. Besides, the task can be scheduled to run at any time.
I decide to run the task on my device this time. Then, tada! The hot 100 songs data is extracted in seconds.
https://preview.redd.it/y90rjtmjs3181.png?width=1400&format=png&auto=webp&v=enabled&s=92921da277eb26a3b83828cea9c97adc42ba330b
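For readers who wonder what the auto-detected workflow corresponds to in code, it is essentially repeated pattern extraction. A sketch against simplified, made-up chart markup (the real Billboard page's structure differs and changes over time):

```python
import re

# Simplified stand-in for chart markup; the real page's tags,
# class names, and nesting are different.
CHART_HTML = """
<li><h3>Song One</h3><span>Artist A</span></li>
<li><h3>Song Two</h3><span>Artist B</span></li>
"""

ENTRY_RE = re.compile(r"<h3>(.*?)</h3><span>(.*?)</span>")

def parse_chart(html: str) -> list[dict]:
    # One dict per chart entry, ranked in page order.
    return [
        {"rank": i, "title": title, "artist": artist}
        for i, (title, artist) in enumerate(ENTRY_RE.findall(html), start=1)
    ]

for entry in parse_chart(CHART_HTML):
    print(entry)
```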
# List Crawling Examples: What Kinds of Listings Are Most Frequently Scraped?
The process of scraping the Billboard hot 100 is impressive. The same approach can be applied to list crawling in a variety of industries. Let’s look at the most frequently scraped listings below.
# Real Estate Listings
Finding a suitable house to rent is the most important thing when moving to a new city. Skimming through real estate websites to check house listings one by one is not a delightful experience. Gathering all the houses available for rent through web scraping eliminates the tedious and repetitive manual work. For real estate agents, scraping listings is also an efficient way to satisfy their customers’ demands.
# E-Commerce Product Listings
Product listings on e-commerce platforms are abundant resources for both merchants and customers. List crawling is so popular in the e-commerce industry that almost all store owners [scrape product listings](https://www.octoparse.com/blog/the-easiest-way-to-extract-data-from-e-commerce-websites) for price monitoring and market research. For e-commerce startups, product scraping is also beneficial for selecting potential products to sell and optimizing the business strategy.
# Job Listings
Job seekers, especially fresh graduates, are frequent visitors to job search websites like Indeed, LinkedIn, Glassdoor, etc. Thousands of jobs across several niches are posted on these websites every day. For job searchers, [scraping job listings](http://www.dataextraction.io/?p=1125) can speed up the process of finding their dream job.
# Travel Data and Hotel Listings
Although Covid-19 is still lingering, people who love traveling are always ready for a refreshing trip. By [scraping travel data and hotel listings](https://www.octoparse.com/blog/tripadvisor-scraper-top-destinations-open-to-the-us-citizens-under-covid), travelers can pull down the web data they need and find destinations currently open to them. For travel agencies, list crawling in this industry offers a chance to track tourists’ behaviors and understand their habits better.
# Conclusion
The listings mentioned above can be scraped using Octoparse through both [built-in templates](https://helpcenter.octoparse.com/hc/en-us/articles/900003158843-Task-Templates-Version-8-) and self-created crawlers. With this no-code web scraping tool, list crawling in any industry is ready to go. Enjoy your journey with Octoparse, and feel free to contact us at [support@octoparse.com](mailto:support@octoparse.com) if there are any problems.
qzgda9
u_Octoparseideas
Octoparseideas
t3_qzgda9
https://www.reddit.com/r/u_Octoparseideas/comments/qzgda9/list_crawling_build_a_billboard_music_scraper_for/
11/22/2021 7:58:53 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
List Crawling: Build A Billboard Music Scraper for Free
False
1
qzgda9
0
32057
3
3
42
4.48239060832444
9
0.96051227321238
0
0
464
49.5197438633938
937
Red
10
Dash Dot Dot
20
No
1075
Posted
6/24/2022 8:36:41 AM
The online job market has undoubtedly overtaken in-person hiring activities. This is especially true as most cities around the globe have faced rounds of lockdowns and more jobs have shifted to remote mode since COVID-19. In this sense, scraping job postings data helps not only institutions and organizations but also individual job seekers.
[https://www.octoparse.com/blog/web-scraping-job-postings/?utm\_source=sale2022&utm\_medium=webscrapingjobpostings&utm\_campaign=reddit](https://www.octoparse.com/blog/web-scraping-job-postings/?utm_source=sale2022&utm_medium=webscrapingjobpostings&utm_campaign=reddit)
vjjoor
Octoparse_ideas
Octoparseideas
t3_vjjoor
https://www.reddit.com/r/Octoparse_ideas/comments/vjjoor/a_complete_guide_to_web_scraping_job_postings/
6/24/2022 8:36:41 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
A Complete Guide to Web Scraping Job Postings
False
1
vjjoor
0
32057
3
3
0
0
0
0
0
0
62
67.3913043478261
92
Red
10
Dash Dot Dot
20
No
1074
Posted
11/26/2021 7:09:24 AM
[https://www.octoparse.com/blog/how-freelancers-make-money-using-web-scraping/?re=](https://www.octoparse.com/blog/how-freelancers-make-money-using-web-scraping/?re=)
A few days ago, I talked to James and found his experience as a freelancer quite inspiring. He started as an Excel panelist, and now his clients are all over the world. So I recorded his story and some real cases he has done, hoping this can give you some ideas if you want to start a freelance career in web scraping.
# What Is Web Scraping?
[Web scraping](https://www.octoparse.com/WebScraping) is a process of collecting data in an automated fashion. Generally, companies use web scraping for price monitoring, customer profiling, lead generation, and targeted advertising to make smarter decisions.
# Why Become A Freelancer for Web Scraping?
Web scraping is getting more and more popular. And the demand for web scraping services is high and still rising.
Back in 2019, when James worked as a cryptocurrency Excel panelist, he happened to find out that all the raw data came from web scraping, which aroused his great interest.
“Data is the new oil, I felt like this was the market you have to jump in right now and see how it can make a profit,” he told me. That’s the time when he decided to go freelance with web scraping and finally started to earn a living on it.
# What Has the Freelancer Done with Web Scraping?
Without further ado, let’s get into the topic. Usually, James works for three kinds of industries: eCommerce, real estate, and marketing.
## I. Data Entry for eCommerce Sellers
Case description:
Product import and export for WooCommerce store owners. This usually means scraping data from one website (maybe their old site or a supplier's site) to move the content over to a new website where no export or API is available, and where large amounts of copy-and-paste work would otherwise be needed.
Related job from freelancing website:
[Enter products into WooCommerce product](https://www.freelancer.com/projects/data-entry/enter-products-into-woocommerce-product/?ngsw-bypass=&w=f)
[WooCommerce copy the image’s attributes](https://www.freelancer.com/projects/woocommerce/woocommerce-copy-the-images-attributes/?ngsw-bypass=&w=f)
The tool he used for these kinds of jobs: Octoparse + WP All Import
Step 1: Get the product data with Octoparse, a no-code web scraping tool, and store it in CSV format. Here’s the guide on[ how to extract data from an eCommerce website](https://www.octoparse.com/blog/extract-data-with-auto-detection).
Step 2: Upload the data to your website using WP All Import. [Follow the guide](https://learnwoo.com/import-products-wp-all-import-woocommerce/) to upload it to the store.
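Conceptually, the hand-off between the two steps is just a product CSV in the column layout the importer expects. A sketch of what step 1's output might look like (the column names here are illustrative, not WP All Import's required schema):

```python
import csv
import io

def products_to_csv(products: list[dict]) -> str:
    # Render scraped product records as CSV text with a header row.
    fieldnames = ["name", "price", "sku"]  # illustrative columns
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(products)
    return buf.getvalue()

scraped = [
    {"name": "Blue Mug", "price": "9.99", "sku": "MUG-001"},
    {"name": "Red Mug", "price": "10.99", "sku": "MUG-002"},
]
print(products_to_csv(scraped))
```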
## II. Price Comparison for Market Research
Case description:
People always need web scraping to get real-time price data for their market research or pricing strategy. These jobs typically involve scraping prices and other information on products from eCommerce websites, vehicles from dealership sites, trips from travel sites, or property information from real estate sites.
Related job from freelancing website:
[Developer needed to add price comparison for commerce website](https://www.upwork.com/freelance-jobs/apply/Developer-needed-add-price-comparison-for-commerce-website_~010cf5655fc8814fe5/)
Preview of the job description:
* Write code to automatically scrape price data from other websites for a list of products
* Redesign our database (MongoDB) and reorganize information about a product in the back-end to allow price comparison (around 170k SKU).
* Modify the front-end of the website to show price comparison
How to improve efficiency:
As this job shows, you may need some programming skills for the database work, but the pay reflects the effort.
To save your energy on the scraping itself, you can use a web scraping tool to collect price data automatically. It spares you a great deal of coding time, and it covers the functions you need, such as scheduled runs and exporting data to your database via API.
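Once the price data is scraped, the comparison step itself is simple aggregation. A minimal sketch, assuming each scraped record carries a SKU, the source site, and a price (all field names here are illustrative):

```python
# Given scraped price records from several sites, find the lowest
# offer per SKU for the comparison front-end.
def cheapest_per_sku(records):
    best = {}
    for rec in records:
        sku, price = rec["sku"], rec["price"]
        if sku not in best or price < best[sku]["price"]:
            best[sku] = rec
    return best

# Example scraped records (illustrative data).
offers = [
    {"sku": "A1", "site": "shopA", "price": 19.99},
    {"sku": "A1", "site": "shopB", "price": 17.49},
    {"sku": "B2", "site": "shopA", "price": 5.00},
]
```

At 170k SKUs you would run this as an aggregation inside the database rather than in application code, but the logic is the same.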
## III. Leads Generation for Marketing
Case description:
Leads are people who have shown interest in your products or service. Every sale starts with a lead. When businesses want to make more profits, chances are they need to keep generating more leads.
Lead generation can be easy with web scraping. Here's a tutorial on how to achieve [lead generation with web scraping](https://www.octoparse.com/blog/lead-generation-with-web-scraping). You can scrape lead information from directories, either individual contact details or company information, to populate CRMs; for example, from platforms such as Yelp or Yellow Pages.
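Before loading scraped leads into a CRM, it pays to normalize and de-duplicate them, since directories often list the same contact several times. A small sketch (field names are illustrative):

```python
# De-duplicate scraped leads by normalized email before a CRM import.
def dedupe_leads(leads):
    seen, unique = set(), []
    for lead in leads:
        key = lead.get("email", "").strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(lead)  # keep the first record seen for each email
    return unique
```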
Related job from freelancing website: [Leads generation](https://www.guru.com/m/find/freelance-jobs/lead-generation/)
## IV. Real-Estate Listing Scraping
Case description:
Finding and extracting real estate data manually can be a long and tedious process, while web scraping makes it quick; here's the [Realtor scraping tutorial](https://helpcenter.octoparse.com/hc/en-us/articles/360018559152-Scrape-real-estate-data-on-Realtor-com). It can be an easy way to make money.
By scraping a real estate website, you can use the extracted data for price comparison, building property lists for clients, industry insights, and more. The information usually comes from real estate sites such as Realtor or Zillow.
# How can Web Scraping Tools Help in These Processes?
James knows Python, but he still prefers using web scraping tools for these tasks, for several reasons:
1. Compared with programming, it is more convenient: you can build a crawler without writing a single line of code. The tools are powerful enough to cover most needs, so he can build crawlers in a few clicks and run them to get the data.
2. These tools offer a comprehensive solution for the troublesome parts of web scraping: you don't have to worry about your IP being blocked, getting blacklisted, or cookie walls.
3. He has tried many web scraping tools, but he still chooses [Octoparse](https://www.octoparse.com/). “The biggest reason is the price advantage,” he said. Octoparse has all the powerful features you need for web scraping, yet comes at an affordable price compared with the others.
# Final Thoughts
Web scraping is not only about data. By selling that data, you can help someone grow their business and make more money yourself. Wanna start your freelancing career? Why not try web scraping!
*[Posted to r/Octoparse_ideas on 11/26/2021: "How Freelancers Make Money, Using Web Scraping"](https://www.reddit.com/r/Octoparse_ideas/comments/r2hej4/how_freelancers_make_money_using_web_scraping/)*
Sale ends TONIGHT at 11:59 pm ET.
✨ Save 30% OFF with an annual plan
✨ Get FREE custom crawlers & training
Save now and get prepared for a new data scraping journey.
[https://www.octoparse.com/summer-sale-2022/?utm\_source=redditlast&utm\_medium=counting0628&utm\_campaign=22summersale](https://www.octoparse.com/summer-sale-2022/?utm_source=redditlast&utm_medium=counting0628&utm_campaign=22summersale)
https://preview.redd.it/ccisx7jas6891.png?width=800&format=png&auto=webp&v=enabled&s=73df0e18d429236cf2ff9e4d703167bc31003d3d
*[Posted to u/Octoparseideas on 6/27/2022: "⌛ Summer Sale 2022 Ends Today!"](https://www.reddit.com/r/u_Octoparseideas/comments/vlytzi/summer_sale_2022_ends_today/)*
https://v.redd.it/yqrdufn8i7881
*[Posted to u/Octoparseideas on 12/28/2021: "How to scrape app reviews from Google Play with Octoparse"](https://www.reddit.com/r/u_Octoparseideas/comments/rq625b/how_to_scrape_app_reviews_from_google_play_with/)*
Original video: https://youtu.be/OneU-njIsXE
*[Comment by u/Octoparseideas on 12/28/2021](https://www.reddit.com/r/u_Octoparseideas/comments/rq625b/how_to_scrape_app_reviews_from_google_play_with/hq8if47/)*
https://youtu.be/4a2KY58JHII
*[Posted to r/Octoparse_ideas on 12/9/2021: "What can Octoparse do to Help your Business"](https://www.reddit.com/r/Octoparse_ideas/comments/rccgzi/what_can_octoparse_do_to_help_your_business/)*
[https://youtu.be/9Y0JWOuDXS0](https://youtu.be/9Y0JWOuDXS0)
😊Hi all, we have recently released the new version 8.5.4! This video introduces new features and improvements for the new version as a quick demo. If you have any problems or feedback, leave us a comment!
*[Posted to r/Octoparse_ideas on 8/2/2022: "What's new in Octoparse 8.5.4?"](https://www.reddit.com/r/Octoparse_ideas/comments/we5mkm/whats_new_in_octoparse_854/)*
Hi friends! Talking about Yelp scraping, most people try to gather local business data such as the business name, contact number, website, business hours, and so on. Today, we are going to show you how to scrape Yelp business data using Octoparse in a few easy steps.
[https://youtu.be/9UBhUQhJTGE](https://youtu.be/9UBhUQhJTGE)
*[Posted to u/Octoparseideas on 5/27/2022: "How to scrape Yelp business data"](https://www.reddit.com/r/u_Octoparseideas/comments/uyuiq6/how_to_scrape_yelp_business_data/)*
[http://www.dataextraction.io/?p=1129/?red=](http://www.dataextraction.io/?p=1129/?red=)
Businesses across the globe are on a constant hunt for talent. Often, this search for skilled candidates is full of unwanted bumps and friction. To eliminate that friction, recruitment executives turn to job and career portals. As per [this](https://www.jobboarddoctor.com/wp-content/uploads/2016/05/2016-Job-Board-Survey-Final-Report.pdf) report, ***20% of total hirings happen through job portals***. Also, 20% of job portals are new entrants, i.e., less than 2 years old.
With the rise of SaaS, PaaS, and other ready-to-market pre-packaged software tools, the entry barriers to this section of the recruitment industry have significantly reduced. But there has also been a spike in the number of job boards that fail within 3 years.
So, **how to build a successful job board website?** In this article, we explain how you can create a successful, profitable and sustainable ***“niche job board website”***.
# What Are Niche Job Boards?
Niche job boards are industry and/or location-specific job aggregator websites. They target a specific domain or industry vertical. In short, it’s a **segmented matchmaking platform for jobs & jobseekers, employees & employers, talent & opportunities from a particular industry or geography**. As per research, more than 40% of job boards are functioning with only 1–5 employees. It’s easy and affordable to start a niche job board.
# What Are The Secrets Of Building Successful Niche Job Boards?
[Healthcarejobsite](https://www.healthcarejobsite.com/), [efinancialcareers](https://www.efinancialcareers.com/), [allretailjobs](https://www.allretailjobs.com/) are some of the successful niche-focused websites for healthcare, finance, and retail industries respectively. There are thousands of specialized job boards tailored to specific industries and domains. But only a few are popular. We list some of the **strategies that have worked for successful Niche Job Boards** –
* Automated Job Aggregation using web scraping
* Extensive Database of job postings & candidates
* Focused Industrial Domain
* Focused Demography, Region
* Collaboration & *Relationship Building* with companies
* Multi-channel Monetization Models, Affiliate, Membership, Pay per click, Pay per post, Promotional posts
* Resume Building Templates For Candidates
* Efficient & Robust Matchmaking Algorithm
* Online Video Interviews & skill validation
* Review & Testimonials Feature For Funneling Right Candidates & Companies
* Salary Calculators & Career Growth Resources
# How To Build A Niche Job Board Website?
**Step 1:** Finding Your Niche For The Job Aggregator
**Step 2:** Decide On The Technological Stack Esp., **Job Scraping** Mechanism
**Step 3:** Build Your Job board In-House or Outsource
**Step 4:** Marketing & Launch
**Step 5:** Continuously **Aggregate Job Postings**, Collaborate With Companies, Engage Candidates, and Iteratively Improve Your Platform
# How To Find The Right Niche For Your Job Board?
This is one of the critical decision-making steps for starting a job board. Factor in the following to choose the best-suited niche for your job board:
* **Your Expertise & Exposure To The Industry**
It can be very beneficial for an entrepreneur to start in a domain where they already have experience and expertise. You can cash in on your existing industry contacts and associations.
For example, ReactJs is a highly popular front-end web development technology with 170k+ stars on GitHub. More and more companies are hunting for employees adept in ReactJs. Starting a React-focused niche job board could be rewarding. Who knows?
* **Industry, Demand & Market Trends**
You have to take into consideration the future growth prospects of the industry in which you decide to start your niche job board.
For example, Covid forced companies to go remote temporarily. This boosted the trend to go permanently remote. Companies will be looking to hire remote workers. This rising trend translates into higher demand for niche remote job sites.
* **Competitive Landscape**
Some of the industrial niches in the recruitment field are already very saturated. So, you as an entrepreneur evaluating niches for job boards must contemplate the competition. Based on the analysis, you may bring in attractive services, features, and business models to beat the competition.
Niche job boards can be segmented based on the hierarchy, seniority level, specialization, demography, etc.
* **Domain Specialization**: Every day several new startups are launched. Founders might not be always “jack of all trades”. Even if they are, businesses need specialized skills & experience in marketing, sales, tech, legal, etc., to propel on the growth roadmap. So, job boards can be specifically built to serve CMOs, CTOs, Directors, Legal advisors & Promoters.
[LucasGroup.com](http://lucasgroup.com/), [datajobs.com](http://datajobs.com/), [jobsfordevops.com](http://jobsfordevops.com/) are some of the high niche-focused dedicated tech job boards.
* **Level of skill**: The right mix of experience & skill is always necessary to balance the growth, cultural and financial wheels of any enterprise. You may consider focusing on junior-level executives, intermediate-level workers, or senior-level, high-caliber, high-performing employees.
[AllExecutiveJobs](https://www.allexecutivejobs.com/) is a niche website for senior-level professionals in Europe.
* **Demography & Region**: What suits Asians may not suit Europeans, or say Australians and Americans. The world changes every two kilometers. We are different, and so are our expectations from work. Niche job boards can be personalized for a particular demography, or for people in a city, country, or continent.
[JobsInMilan.com](http://jobsinmilan.com/) is a geographical niche job website catering to employees in Milan city of Italy.
# How To Choose The Right Tech Stack For Your Job Board?
* **Web Scraping Job Postings** From Other Job Portals, Career Sites, and enterprise ATSs
You can either use **automated data extraction crawlers like Octoparse** or collect job postings from websites manually. Automated solutions can scrape millions of postings daily and scale easily; collecting that volume by hand every day is simply not possible.
* **Website Development —** Frontend, Backend, Database, Cloud Service
There is a plethora of options to choose from. For the frontend, you can consider JavaScript frameworks such as React, Angular, or Vue, paired with a CSS toolkit like Bootstrap. For the backend, you may evaluate Node.js, Python (e.g., Django), Java, or PHP.
If you prefer a CMS, you may consider WordPress, Drupal, or Joomla for developing the website. But these platforms can introduce bottlenecks and lock-in that work against building a scalable website.
For Cloud services i.e., hosting your website application, you may go with any of AWS, GCP, or Azure. Cloud delivers agility and cost-effectiveness.
# Use Octoparse for Web Scraping Job Postings At Scale
You can **scrape millions of jobs from thousands of career websites in no time** using an automated job scraping solution like Octoparse. One of the major factors that distinguish a successful job aggregator business from the other 80% which close in the first five years is the quantity and quality of aggregated job postings on your platform.
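A recurring piece of any daily aggregation pipeline, however the postings are scraped, is diffing today's scrape against yesterday's so the board only inserts new postings and retires expired ones. A minimal sketch, assuming postings are keyed by title and company (both keys are illustrative):

```python
# Compare yesterday's scrape with today's: return newly appeared
# postings and postings that have disappeared (likely filled/expired).
def diff_postings(old, new):
    old_ids = {(p["title"], p["company"]) for p in old}
    new_ids = {(p["title"], p["company"]) for p in new}
    added = [p for p in new if (p["title"], p["company"]) not in old_ids]
    removed = [p for p in old if (p["title"], p["company"]) not in new_ids]
    return added, removed
```

A real pipeline would key on a stable posting ID or URL when the source provides one, since titles get reworded between days.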
You can even **integrate the Octoparse web scraping tool with all the major applicant tracking systems (ATS)** enterprises use to post jobs.
With the visual scraper tools Octoparse provides, scraping job postings becomes a breeze. Besides, it's **robust, ridiculously fast, time-efficient, and cost-effective**. You can run the crawler locally or in the cloud, based on your requirements.
Here we demonstrate how to scrape Amazon career sites using Octoparse. Do read the insight.
Now, we shall demonstrate scraping one of the popular hospital job aggregator portals, CareerVitals. We shall scrape Surgeon Jo...
*[Posted to u/Octoparseideas on 9/17/2021: "How To Develop And Grow Your Niche Job Board Aggregator Websites?"](https://www.reddit.com/r/u_Octoparseideas/comments/pps9yr/how_to_develop_and_grow_your_niche_job_board/)*
Nowadays companies use an ATS (Applicant Tracking System) to post jobs and find the perfect candidates. But in a competitive industry like healthcare, candidates won’t necessarily come to your company’s career site and search for jobs. HR personnel would then copy their job lists and go post them on different career websites. However, there are a few problems with this manual approach. First, it requires a lot of manual work, and copying and pasting across different career websites often introduces clerical errors; as a result, candidates get confused and give a listing a pass. Second, your posts on different sites won’t stay updated: once you change even small details in the job description, you have to go back and amend each post manually.
As a result, it is not hard to imagine that even with a good company, you don’t have the best fishing net to capture the best candidates. However, there is an excellent tool that saves HR a ton of time while sourcing as many good candidates as possible from different career sites. A scraping tool like Octoparse is the answer: it enables an automated job board that integrates with a company’s ATS and pushes the latest listings to different career sites.
Breathing is automated. It makes our life so easy. We realize its importance, even more, when we see people going on ventilators. But are your HR operations and processes automated? Is your job creation, listing, interviewing, and hiring process fully optimized?
We ask these questions as we see a lot of companies unnecessarily struggling. Several enterprises are still manually posting their vacancies to different job boards. We at Octoparse believe that's quite outdated in the age of AI & data science.
“The world is changing whether you like it or not. Get involved or get left behind.” — Dave Waters
And so, in this article, we emphasize the importance of automating some of the cumbersome HR processes & operations. We shall demonstrate the use of Web crawling to scrape job listings from career sites. Then, we shall post scraped job data automatically to job boards of your choice using XML job feeds.
We provide you with a seamless job wrapping solution stack consisting of an Applicant tracking system (ATS), Octoparse Scraping services, and XML feeds. This would be easily integrable into your existing HR tech stack.
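The XML-feed end of that stack is straightforward to picture: scraped listings get serialized into a feed the target job board ingests. A minimal sketch using the standard library; the `<jobs>`/`<job>` schema here is purely illustrative, since each job board publishes its own feed specification.

```python
import xml.etree.ElementTree as ET

def build_feed(jobs):
    """Serialize a list of job dicts into a simple XML feed string."""
    root = ET.Element("jobs")
    for job in jobs:
        node = ET.SubElement(root, "job")
        for field, value in job.items():
            # One child element per scraped field, e.g. <title>...</title>
            ET.SubElement(node, field).text = value
    return ET.tostring(root, encoding="unicode")
```

In practice you would also wrap free-text fields like descriptions in CDATA and validate the feed against the board's schema before pushing it.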
https://preview.redd.it/20kgbhmiclq71.png?width=512&format=png&auto=webp&v=enabled&s=0dfa0fee03af964ac248622dfe695a51c471477a
Not to mention, this relieves the HR team from manual work and enables them to focus on more critical aspects of the hiring cycle i.e., creating a perfect job description, training & onboarding the talent, etc.
[Keep reading: What is an Application Tracking System (ATS)?](http://www.dataextraction.io/?p=1135/?re=)
*[Posted to u/Octoparseideas on 9/30/2021: "Automate Job Feed Scraping & Posting To Scale-Up Your Business"](https://www.reddit.com/r/u_Octoparseideas/comments/pyeobl/automate_job_feed_scraping_posting_to_scaleup/)*
[https://www.octoparse.com/blog/what-do-you-know-about-a-screen-scraper/?re=](https://www.octoparse.com/blog/what-do-you-know-about-a-screen-scraper/?re=)
Screen scraping is a data collecting technique usually used to copy information that shows on a digital display so it can be used for another purpose. In this article, we will introduce the process of screen scraping and how a screen scraper works.
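The core of screen scraping a web page is keeping only the text the page actually displays while discarding markup, scripts, and styles. A minimal sketch of that idea with Python's standard-library HTML parser (real screen scrapers handle far more, e.g. layout and dynamic content):

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect only the text a browser would display."""
    def __init__(self):
        super().__init__()
        self.skip = 0      # depth inside <script>/<style> blocks
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1

    def handle_data(self, data):
        if not self.skip and data.strip():
            self.chunks.append(data.strip())

def visible_text(html):
    p = VisibleText()
    p.feed(html)
    return " ".join(p.chunks)
```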
https://preview.redd.it/9u5tfm9olhq81.jpg?width=1080&format=pjpg&auto=webp&v=enabled&s=3fd2d3cd65846ff7c004cd75ecb4d76b0a382a3c
*[Posted to u/Octoparseideas on 3/30/2022: "What Is Screen Scraping and How Does It Work?"](https://www.reddit.com/r/u_Octoparseideas/comments/ts29vv/what_is_screen_scraping_and_how_does_it_work/)*
Octoparse salutes Black Friday: [https://youtu.be/En7MS6lo8WQ](https://youtu.be/En7MS6lo8WQ)
This Black Friday, enjoy Octoparse at a lower price, with a new version!
Save 30-40% from 11/17 to 12/3, 2021 (23:59 EST)
Extra 10-15% OFF on the first day, 11/17 (EST) only
Plus free giveaways: a custom crawler and training
New in version 8.4:
Cool new features: custom user agent, page scroll-down, and Zapier integration
A faster engine, a more intuitive layout, and robust exporting
Tune in to Octoparse for the coming Black Friday and save big!
*[Posted to u/Octoparseideas on 11/11/2021: "Octoparse salutes Black Friday"](https://www.reddit.com/r/u_Octoparseideas/comments/qrb7e4/octoparse_salutes_black_friday/)*
[https://youtu.be/aEh1coudY9s](https://youtu.be/aEh1coudY9s)
In this tutorial, I’ll show you how to use web scraping templates in Octoparse 8.4 to extract Amazon product reviews in 3 easy steps.
The ready-to-use template is a unique feature of Octoparse. They are prebuilt crawlers that can be used to scrape popular websites such as Amazon, Facebook and many more. Since all the data fields are pre-set, there’s no need to configure the crawlers by yourself. Simply enter the search value and it will fetch the data for you right away.
The Amazon template we picked in this video takes ASINs as input and gathers the rating, review date, and review text for each product.
The sample output gives you an idea of what the end result will look like when it completes. We should be able to get all the information in a nice and structured format!
*[Posted to r/Octoparse_ideas on 12/1/2021: "How to scrape Amazon product reviews in three easy steps"](https://www.reddit.com/r/Octoparse_ideas/comments/r643r3/how_to_scrape_amazon_product_reviews_in_three/)*
The online job market has undoubtedly overtaken in-person hiring activities. This is especially true as most cities around the globe have faced rounds of lockdown and more jobs have shifted to remote work since COVID-19. In this sense, scraping job postings data helps not only institutions and organizations but also individual job seekers.
[https://www.octoparse.com/blog/web-scraping-job-postings/?utm\_source=sale2022&utm\_medium=webscrapingjobpostings&utm\_campaign=reddit](https://www.octoparse.com/blog/web-scraping-job-postings/?utm_source=sale2022&utm_medium=webscrapingjobpostings&utm_campaign=reddit)
*[Posted to u/Octoparseideas on 6/24/2022: "A Complete Guide to Web Scraping Job Postings"](https://www.reddit.com/r/u_Octoparseideas/comments/vjjo9r/a_complete_guide_to_web_scraping_job_postings/)*
*[Posted to r/Octoparse_ideas on 11/11/2021: "Octoparse salutes Black Friday" (same text as the earlier Black Friday post)](https://www.reddit.com/r/Octoparse_ideas/comments/qrb8lc/octoparse_salutes_black_friday/)*
*[Posted to r/Octoparse_ideas on 5/27/2022: "How to scrape Yelp business data" (same text as the earlier Yelp post)](https://www.reddit.com/r/Octoparse_ideas/comments/uyuj4t/how_to_scrape_yelp_business_data/)*
Web scraping makes it possible to gather and store a lot of data quickly. This article will assist you in scraping customer evaluations from Trustpilot, an online review platform.
[https://www.octoparse.com/blog/how-to-scrape-trustpilot/?utm\_source=2022q3&utm\_medium=how-to-scrape-trustpilot&utm\_campaign=reddit](https://www.octoparse.com/blog/how-to-scrape-trustpilot/?utm_source=2022q3&utm_medium=how-to-scrape-trustpilot&utm_campaign=reddit)
*[Posted to r/Octoparse_ideas on 9/2/2022: "Best Trustpilot Scraper to Get Data from Trustpilot Easily"](https://www.reddit.com/r/Octoparse_ideas/comments/x3umdk/best_trustpilot_scraper_to_get_data_from/)*
[https://www.octoparse.com/blog/what-is-a-web-crawler-and-how-does-it-work-at-your-benefit/?re=](https://www.octoparse.com/blog/what-is-a-web-crawler-and-how-does-it-work-at-your-benefit/?re=)
A web crawler, also known as a web spider or search engine bot, is a bot that visits and indexes the content of web pages all over the Internet. In this article, we will guide you through the way showing how web crawlers work and the differences between web crawling and web scraping.
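The visit-and-index behavior described above boils down to a frontier loop: pop a page, record it, queue every unseen link it contains. The sketch below shows that logic over an in-memory "site" (a dict mapping each URL to the links found on that page) so it stays self-contained; a real crawler would fetch pages over HTTP and respect robots.txt and rate limits.

```python
from collections import deque

def crawl(site, start):
    """Breadth-first crawl over an in-memory link graph.

    Returns the pages in the order a crawler would visit them."""
    seen, order = {start}, []
    queue = deque([start])
    while queue:
        url = queue.popleft()
        order.append(url)               # "index" the page
        for link in site.get(url, []):  # enqueue unseen outlinks
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

# A tiny illustrative site: page -> links on that page.
site = {"/": ["/a", "/b"], "/a": ["/b", "/c"], "/b": [], "/c": ["/"]}
```

The `seen` set is what keeps the crawler from looping forever on cyclic links, such as `/c` pointing back to `/` here.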
https://preview.redd.it/v85twmv983q81.jpg?width=800&format=pjpg&auto=webp&v=enabled&s=202735f01da0bd6b21599569fd36090130cacf86
*[Posted to u/Octoparseideas on 3/28/2022: "What Is a Web Crawler and How Does It Work"](https://www.reddit.com/r/u_Octoparseideas/comments/tq4w2g/what_is_a_web_crawler_and_how_does_it_work/)*
[https://youtu.be/8U3-VLSp3vA](https://youtu.be/8U3-VLSp3vA)
Web scraping itself is not illegal. Although scraping public data is now ruled legal according to a U.S. appeals court ruling, for people who want to give it a shot, it is still important to identify the risks around web scraping practices. Check out Octoparse's new video to learn if web scraping is legal and why.
*[Posted to r/Octoparse_ideas on 7/18/2022: "Is Web Scraping Legal and Why?"](https://www.reddit.com/r/Octoparse_ideas/comments/w1u5vx/is_web_scraping_legal_and_why/)*
How much do you know about web scraping? Don't worry even if you are new to this concept. As in this article, we will brief you on the basics of web scraping, teach you how to assess web scraping tools to get one that best fits your needs, and last but not least, present a list of web scraping tools for your reference.
[https://www.octoparse.com/blog/9-free-web-scrapers-that-you-cannot-miss/?utm\_source=sale2022&utm\_medium=10freewebscrapers&utm\_campaign=reddit](https://www.octoparse.com/blog/9-free-web-scrapers-that-you-cannot-miss/?utm_source=sale2022&utm_medium=10freewebscrapers&utm_campaign=reddit)
*[Posted to r/Octoparse_ideas on 5/18/2022: "10 FREE Web Scrapers That You Cannot Miss in 2022"](https://www.reddit.com/r/Octoparse_ideas/comments/us3w6u/10_free_web_scrapers_that_you_cannot_miss_in_2022/)*
Welcome to Octoparse new training session videos: Parse with Octoparse in 3 minutes.
In this session, we will walk you through how to use Octoparse in 16 lessons in 5 parts, which are introduction, basics, intermediate use, advanced use and troubleshooting.
Find lessons 1-8 below:
[https://youtu.be/jn8Ue15RVPQ](https://youtu.be/jn8Ue15RVPQ)
[https://youtu.be/QlJst0SfUG8](https://youtu.be/QlJst0SfUG8)
[https://youtu.be/o7RNlQdQIqM](https://youtu.be/o7RNlQdQIqM)
[https://youtu.be/1U8kxcWXQSI](https://youtu.be/1U8kxcWXQSI)
[https://youtu.be/XQYYF\_i16kM](https://youtu.be/XQYYF_i16kM)
[https://youtu.be/vzSE-SL-l4s](https://youtu.be/vzSE-SL-l4s)
[https://youtu.be/gyDV\_4Jeq\_I](https://youtu.be/gyDV_4Jeq_I)
[https://youtu.be/\_dQKYS9zXzc](https://youtu.be/_dQKYS9zXzc)
*[Posted to r/Octoparse_ideas on 3/29/2022: "Parse with Octoparse in 3 minutes"](https://www.reddit.com/r/Octoparse_ideas/comments/tqvmhp/parse_with_octoparse_in_3_minutes/)*
*[Posted to r/SaaS on 12/17/2021: "What can Octoparse do to Help your Business? Find more about the no-coding SaaS product." (post removed)](https://www.reddit.com/r/SaaS/comments/ri8lh5/what_can_octoparse_do_to_help_your_business_find/)*
9/30/2021 7:16:14 AM
Nowadays companies use an ATS (Application Tracking System) to post jobs and find the perfect candidates. But in a competitive industry like healthcare, candidates won’t necessarily come to your company’s career site and search for jobs. HR personnel would then copy their job lists and go to different career websites and post them. However, there are few problems with this manual approach. First, it requires a lot of manual work. When it comes to copy and paste for different career websites, it often involves clerical error. As a result, candidates get confused and give a listing a pass. Second, your posts on different sites won’t be updated. What it means is that once you change little details on the job description. You will have to manually go back and amend each post.
As a result, it is not hard to imagine even if you have good company, you don’t have the best fishing net to capture all the best candidates. However, there is an excellent tool that saves HR a ton of time as well as sourcing as many good candidates as possible from different career sites. A scraping tool like Octoparse is the answer to creating an automated job board that integrates with a company’s ATS and pushing the latest listings to different career sites.
Breathing is automated. It makes our life so easy. We realize its importance, even more, when we see people going on ventilators. But are your HR operations and processes automated? Is your job creation, listing, interviewing, and hiring process fully optimized?
We ask these questions as we see a lot of companies are unnecessarily struggling. Several enterprises are still manually posting their vacancies to different job boards. We at Octoparse believe it’s quite orthodox in the age of AI & data science.
“The world is changing whether you like it or not. Get involved or get left behind.” — Dave Waters
And so, in this article, we emphasize the importance of automating some of the more cumbersome HR processes and operations. We will demonstrate how to use web crawling to scrape job listings from career sites, then post the scraped job data automatically to job boards of your choice using XML job feeds.
We provide a seamless job-wrapping solution stack consisting of an applicant tracking system (ATS), Octoparse scraping services, and XML feeds, which integrates easily into your existing HR tech stack.
https://preview.redd.it/20kgbhmiclq71.png?width=512&format=png&auto=webp&v=enabled&s=0dfa0fee03af964ac248622dfe695a51c471477a
Not to mention, this relieves the HR team from manual work and lets them focus on more critical aspects of the hiring cycle, such as crafting the perfect job description and training and onboarding new talent.
[Keep reading: What is an Applicant Tracking System (ATS)?](http://www.dataextraction.io/?p=1135/?re=)
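The second half of the pipeline, serializing scraped listings into an XML job feed, can be sketched in a few lines of Python. The `<source>`/`<job>` layout and field names below are illustrative assumptions; every job board publishes its own feed schema, so check the target board's documentation before wiring this up.

```python
import xml.etree.ElementTree as ET

def build_job_feed(jobs):
    """Serialize scraped job listings into a minimal XML job feed.

    The <source>/<job> layout and field names are illustrative only;
    each job board defines its own feed schema.
    """
    source = ET.Element("source")
    for job in jobs:
        node = ET.SubElement(source, "job")
        for field in ("title", "location", "url", "description"):
            ET.SubElement(node, field).text = job.get(field, "")
    return ET.tostring(source, encoding="unicode")

# Hypothetical listing shaped like the output of a scraping task
feed = build_job_feed([
    {"title": "Registered Nurse", "location": "Austin, TX",
     "url": "https://example.com/jobs/rn-123",
     "description": "Full-time ICU position."},
])
print(feed)
```

A scheduled task can regenerate this feed whenever the ATS data changes, so every board always pulls the latest listings.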
pyeobl
u_Octoparseideas
Octoparseideas
t3_pyeobl
https://www.reddit.com/r/u_Octoparseideas/comments/pyeobl/automate_job_feed_scraping_posting_to_scaleup/
9/30/2021 7:16:14 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Automate Job Feed Scraping & Posting To Scale-Up Your Business
False
1
pyeobl
0
32057
3
3
Red
10
Dash Dot Dot
20
No
1052
Posted
2/17/2022 7:50:13 AM
🤔When using Octoparse, do you prefer local running or cloud running? Why?
🤗Make the choice, comment with your reasons, and win a 1-month free extension!
[View Poll](https://www.reddit.com/poll/sujbvq)
sujbvq
Octoparse_ideas
Octoparseideas
t3_sujbvq
https://www.reddit.com/r/Octoparse_ideas/comments/sujbvq/local_scraping_vs_cloud_scraping/
2/17/2022 7:50:13 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Local scraping vs. Cloud scraping
False
1
sujbvq
0
32057
3
3
3
8.82352941176471
1
2.94117647058824
0
0
16
47.0588235294118
34
Red
10
Dash Dot Dot
20
No
1051
Posted
6/24/2021 3:33:04 AM
[removed]
o6sowz
webscraping
Octoparseideas
t3_o6sowz
https://www.reddit.com/r/webscraping/comments/o6sowz/scrape_yahoo_finance_to_extract_stock_prices/
6/24/2021 3:33:04 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scrape Yahoo! Finance to Extract Stock Prices
False
0.33
o6sowz
0
32057
3
3
0
0
0
0
0
0
1
100
1
Red
10
Dash Dot Dot
20
No
1050
Posted
4/6/2022 8:46:48 AM
How to rank on the first page? In this article, you will find out how to achieve SEO success through keyword research, backlink research, and LCP improvement.
[https://www.octoparse.com/blog/a-free-and-easy-way-to-improve-google-ranking/?re=](https://www.octoparse.com/blog/a-free-and-easy-way-to-improve-google-ranking/?re=)
txhmcr
Octoparse_ideas
Octoparseideas
t3_txhmcr
https://www.reddit.com/r/Octoparse_ideas/comments/txhmcr/how_to_improve_google_ranking_for_free_in_2022/
4/6/2022 8:46:48 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Improve Google Ranking for Free in 2022
False
1
txhmcr
0
32057
3
3
8
14.0350877192982
0
0
0
0
23
40.3508771929825
57
Red
10
Dash Dot Dot
20
No
1049
Posted
11/29/2021 7:33:34 AM
[http://www.dataextraction.io/?p=1167/?re=](http://www.dataextraction.io/?p=1167/?re=)
Content creators are always under pressure to come up with new ideas, and regular updates are essential. Yet if you have focused on a niche for some time, it is natural to go numb on the topic.
You need more perspectives and new triggers to open up your mind. A web scraping tool and a list of valuable sources can be extremely helpful in this case.
# Who Will Benefit from This Article
* Youtube creators looking for new topics to make videos
* Blog writers seeking new ideas to write
* Content marketers wanting to source ideas efficiently
* Inbound marketing professionals looking for SEO-friendly topics
Modern people are flooded with opinions and content every day. How do we generate ideas from it all?
Reading everything is not practical. We need a structured way to monitor, analyze, and extract value from these sources. A web scraping tool can help build a database that writers and creators can filter through for inspiration.
Alongside the places where you can source ideas, I will also show how web scraping can improve the process.
# 7 Ways to Curate Content Ideas
1. Youtube Channel Crawler
2. Study Google SERP
3. Listen to social media discussions
4. Spy on Competitors’ Blog
5. Idea generator
6. Keyword planning tools (Semrush/Ahrefs)
7. Look back at what you have created
## Youtube Channel Crawler
A few months ago, I shared a post about how to use a [Youtube channel crawler](https://www.octoparse.com/blog/youtube-channel-crawler) to review your channel and spy on competitors’.
[https://youtu.be/oJtawA\_0bHI](https://youtu.be/oJtawA_0bHI)
In this way, you can grab the topic data (titles, tags, descriptions) and performance data (views, likes) of published videos. Filtering through the data tells you what to focus on: the well-performing topics that your channel has never touched.
Always keep up with the competition.
A topic performs well on another channel and you haven’t covered it yet? That’s potential. Spot the gap and fill it with a video better than any of the existing ones.
Besides, the comments viewers leave under a video are important signals too: they help you gauge how intriguing the topic is to your target audience and learn what you can improve to outrank your rivals.
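The gap-spotting step above can be sketched in plain Python. The field names (`topic`, `views`) and the view threshold are assumptions standing in for whatever columns your scraped channel data actually has:

```python
def gap_topics(competitor_videos, own_titles, min_views=10_000):
    """Return competitor topics that perform well (>= min_views views)
    and do not yet appear in any of your own video titles."""
    own = " ".join(t.lower() for t in own_titles)
    gaps = []
    for video in competitor_videos:
        if video["views"] >= min_views and video["topic"].lower() not in own:
            gaps.append(video["topic"])
    return gaps

# Toy data shaped like a scraped channel export
competitor = [
    {"topic": "web scraping", "views": 52_000},
    {"topic": "regex basics", "views": 48_000},
    {"topic": "vlog update", "views": 900},
]
mine = ["Web Scraping for Beginners", "My Setup Tour"]
print(gap_topics(competitor, mine))  # ['regex basics']
```

Substring matching on titles is deliberately crude; with real data you would match on tags or keywords instead.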
## Study Google SERP
A Google SERP is helpful if you want to generate SEO-performing content.
* Top-ranking articles on the result page tell us what people are looking for. Always match your content with the need of your audience.
* Titles and descriptions (and the query you searched for) shed light on which topics are welcome and which words or phrases can be used as SEO-friendly keywords.
[Google SERP Crawler Works for SEO Benefits](https://preview.redd.it/m80wmom7mh281.png?width=1024&format=png&auto=webp&v=enabled&s=ba6b251e5632e4320817329e409cb65a926a1871)
Use your general idea as the query to search on Google, and the search results will help you niche down to a more targeted scope.
## Listen to Social Media Discussions
Articles on how to source content ideas never fail to mention social media platforms: Youtube, Instagram, Reddit, just to name a few. Rightly so; they are valuable and resourceful, and the conversations there reveal what people think and care about.
Social media and Q&A platforms to source ideas:
* Twitter
* Instagram
* Facebook
* Youtube
* Medium
* Reddit
* Quora
* Niche communities (e.g. Stack Overflow for techies)
The key, however, is to find the places where your target audience gathers. Use hashtags, groups, subreddits, channels, and direct search to filter out the noise and focus on the voices of your potential audience.
## Spy on Competitors’ Blog
If you are a Youtuber, keep a close eye on your competitors’ channels; the same goes for bloggers. And do not limit yourself to direct rivals: any website or channel that shares an audience with you is worth attention. They help us think outside the box and see what extra value we can offer our audience.
## Idea Generator
There are web-based tools designed to help creators find related topics. Most use algorithms that fit recurring words into certain patterns and generate questions or titles for writers to work from. The titles offered may not be a perfect fit to use directly, but they can broaden your thinking.
>> Question generator: [AnswerthePublic](https://answerthepublic.com/)
>> Title generator: [HubspotTopic](https://www.hubspot.com/blog-topic-generator)
## Keyword Planning Tools
Keywords are so strongly associated with SEO that they are often mistaken for a magic source of high traffic. In reality, keywords and their popularity do not guarantee traffic; they work as a signal of what people are seeking. A keyword planning tool is therefore an instrument for content marketers to discover customer needs.
A good content idea is not necessarily new, original, or fancy, but it must meet the real needs of your audience.
[SEMrush](https://www.semrush.com/lp/sem/en) and [Ahrefs](https://ahrefs.com/) are good tools for exploring keywords and finding out what people are looking for on the Internet. There are also keyword tools tailored for Youtubers, such as [Socialblade](https://socialblade.com/) and [KeywordTool](https://keywordtool.io/youtube).
## Look Back at What You Have Created
A super influencer with 1 million followers may not care much about SEO; a new release will reach his or her followers anyway. Social sharing plays the big role, and a good ranking is just a byproduct.
A growing channel is different: SEO matters a lot. Analyzing what you have published and how it performed is important for understanding who your audience is and what kind of content attracts them.
If you are running a blog, make sure to set up [Google Analytics](https://analytics.google.com/analytics/web/) and [Google Search Console](https://search.google.com/search-console/about) to monitor the performance of your posts and make adjustments accordingly.
# Conclusions
You may source content ideas as an observer, browsing existing posts, reviews, blogs, and videos; both the quality and the quantity of your input matter. On the other hand, creating opportunities for direct interaction with your audience, such as holding an interview or starting a live conversation, may inspire you in a new way.
r4qgt6
u_Octoparseideas
Octoparseideas
t3_r4qgt6
https://www.reddit.com/r/u_Octoparseideas/comments/r4qgt6/7_ways_to_find_content_ideas_from_the_web_and/
11/29/2021 7:33:34 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
7 Ways to Find Content Ideas from the Web and Create Like a Pro
False
1
r4qgt6
0
32057
3
3
36
3.38983050847458
9
0.847457627118644
0
0
530
49.9058380414313
1062
Red
10
Dash Dot Dot
20
No
1048
Posted
9/14/2022 3:02:04 AM
If you want to monitor your website's ranking on Google, analyze your competitors, or study paid ads on Google, then scraping the search results is the best way to get started.
In this article, we will look at 3 different ways to scrape Google Search Results.
[https://www.octoparse.com/blog/scrape-google-search-results/?utm\_source=2022q3&utm\_medium=scrape-google-search-results&utm\_campaign=reddit](https://www.octoparse.com/blog/scrape-google-search-results/?utm_source=2022q3&utm_medium=scrape-google-search-results&utm_campaign=reddit)
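As a rough illustration of the DIY route, here is a minimal Python sketch that pulls result titles out of a saved results page using only the standard library. It assumes Google currently renders organic result titles inside `<h3>` tags; that markup changes often, so treat the selector as an assumption, not a guarantee.

```python
from html.parser import HTMLParser

class SerpTitleParser(HTMLParser):
    """Collect the text of <h3> tags, where Google currently places
    organic result titles (this markup changes frequently)."""
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_h3 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_h3 = True

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_h3 = False

    def handle_data(self, data):
        if self._in_h3 and data.strip():
            self.titles.append(data.strip())

# In practice you would read a saved results page from disk;
# this tiny inline snippet stands in for it.
snippet = "<div><h3>Best Web Scrapers 2022</h3><h3>Scraping 101</h3></div>"
parser = SerpTitleParser()
parser.feed(snippet)
print(parser.titles)
```

Fetching live result pages at scale is exactly where this approach breaks down, which is why the article also covers tool-based and API-based options.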
xdqoci
u_Octoparseideas
Octoparseideas
t3_xdqoci
https://www.reddit.com/r/u_Octoparseideas/comments/xdqoci/3_easy_ways_to_scrape_google_search_results/
9/14/2022 3:02:04 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
3 Easy Ways to Scrape Google Search Results
False
1
xdqoci
0
32057
3
3
1
1.05263157894737
0
0
0
0
58
61.0526315789474
95
Red
10
Dash Dot Dot
20
No
1047
Posted
6/24/2021 3:33:04 AM
[removed]
o6sowz
webscraping
Octoparseideas
t3_o6sowz
https://www.reddit.com/r/webscraping/comments/o6sowz/scrape_yahoo_finance_to_extract_stock_prices/
6/24/2021 3:33:04 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scrape Yahoo! Finance to Extract Stock Prices
False
0.33
o6sowz
0
32057
3
3
Red
10
Dash Dot Dot
20
No
1046
Posted
8/24/2022 3:50:50 AM
Are you looking for a Youtube comment scraper for sentiment analysis? If so, you have come to the right place. This article shows you the two easiest methods for scraping Youtube comments.
[https://www.octoparse.com/blog/youtube-comment-scraper/?utm\_source=2022q3&utm\_medium=youtube-comment-scraper&utm\_campaign=reddit](https://www.octoparse.com/blog/youtube-comment-scraper/?utm_source=2022q3&utm_medium=youtube-comment-scraper&utm_campaign=reddit)
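Once the comments are scraped, even a toy scorer shows the shape of sentiment analysis. The word lists below are made up for illustration; a real pipeline would use a trained model (e.g. VADER or a transformer) rather than a hand-written lexicon.

```python
POSITIVE = {"great", "love", "helpful", "amazing"}
NEGATIVE = {"bad", "boring", "hate", "useless"}

def score(comment):
    """Crude lexicon score: +1 per positive word, -1 per negative word.
    A real pipeline would use a trained sentiment model instead."""
    words = comment.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

comments = ["Great tutorial, love it", "boring and useless", "ok I guess"]
print([score(c) for c in comments])  # [2, -2, 0]
```

Aggregating these scores per video (or per day) turns raw comment dumps into a trend you can act on.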
ww93pc
u_Octoparseideas
Octoparseideas
t3_ww93pc
https://www.reddit.com/r/u_Octoparseideas/comments/ww93pc/scrape_youtube_comments_for_sentiment_analysis/
8/24/2022 3:50:50 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scrape YouTube Comments for Sentiment Analysis
False
1
ww93pc
0
32057
3
3
2
2.73972602739726
0
0
0
0
46
63.013698630137
73
Red
10
Dash Dot Dot
20
No
1045
Posted
11/5/2021 8:55:50 AM
[removed]
qn6vnm
Octoparse_ideas
Octoparseideas
t3_qn6vnm
https://www.reddit.com/r/Octoparse_ideas/comments/qn6vnm/extract_emails_from_any_website_for_cold_email/
11/5/2021 8:55:50 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Extract Emails from Any Website for Cold Email Marketing
False
1
qn6vnm
0
32057
3
3
0
0
0
0
0
0
1
100
1
Red
10
Dash Dot Dot
20
No
1044
Posted
9/6/2022 3:49:12 AM
Real estate scraping is now a thriving method for analyzing potential buyers, consumer needs, optimal prices, and plenty of other online information. Do you need such data to improve your real estate business? Then this article will surely come in handy.
[https://www.octoparse.com/blog/real-estate-scraping/?utm\_source=2022q3&utm\_medium=real-estate-scraping&utm\_campaign=reddit](https://www.octoparse.com/blog/real-estate-scraping/?utm_source=2022q3&utm_medium=real-estate-scraping&utm_campaign=reddit)
x707y9
Octoparse_ideas
Octoparseideas
t3_x707y9
https://www.reddit.com/r/Octoparse_ideas/comments/x707y9/how_can_real_estate_scraping_improve_your_business/
9/6/2022 3:49:12 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How Can Real Estate Scraping Improve Your Business
False
1
x707y9
0
32057
3
3
4
4.65116279069767
0
0
0
0
47
54.6511627906977
86
Red
10
Dash Dot Dot
20
No
1043
Posted
10/19/2021 2:40:46 AM
Octoparse users, how's your journey with the product?
Share your story with Octoparse with hashtag "[\#OctoparseinYourArea](https://www.linkedin.com/feed/hashtag/?keywords=octoparseinyourarea&highlightedUpdateUrns=urn%3Ali%3Aactivity%3A6856055699533918208)" on Facebook / Twitter / LinkedIn, and win the FREE gift pack of all scrapers created in the video series "Learn from Community"([https://hubs.la/H0ZCX\_j0](https://hubs.la/H0ZCX_j0))!
Come and participate in the event!
qb24kr
u_Octoparseideas
Octoparseideas
t3_qb24kr
https://www.reddit.com/r/u_Octoparseideas/comments/qb24kr/event_octoparseinyourarea/
10/19/2021 2:40:46 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Event - #OctoparseinYourArea
False
1
qb24kr
0
32057
3
3
2
3.03030303030303
0
0
0
0
38
57.5757575757576
66
Red
10
Dash Dot Dot
20
No
1042
Posted
5/30/2022 9:07:42 AM
Nowadays people rely on PDFs for reading, presenting, and many other purposes, and many websites store data in PDF files for viewers to download instead of posting it on web pages, which poses challenges for web scraping. You can view, save, and print PDF files with ease. The problem is that PDF is designed to preserve the integrity of a file: it is more of an "electronic paper" format, built to make sure content looks the same on any computer at any time. That makes it difficult to edit a PDF file or export data from it.
Fortunately, there are solutions that help extract data from PDF into Excel.
[https://www.octoparse.com/blog/how-to-extract-pdf-into-excel/?utm\_source=sale2022&utm\_medium=extractdatafrompdf&utm\_campaign=reddit](https://www.octoparse.com/blog/how-to-extract-pdf-into-excel/?utm_source=sale2022&utm_medium=extractdatafrompdf&utm_campaign=reddit)
v0w3f5
Octoparse_ideas
Octoparseideas
t3_v0w3f5
https://www.reddit.com/r/Octoparse_ideas/comments/v0w3f5/how_to_extract_data_from_pdf_to_excel_without/
5/30/2022 9:07:42 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Extract Data from PDF to Excel Without Coding Skills
False
1
v0w3f5
0
32057
3
3
5
3.2258064516129
2
1.29032258064516
0
0
79
50.9677419354839
155
Red
10
Dash Dot Dot
20
No
1041
Posted
6/21/2022 2:42:14 AM
Google Sheets can serve as a basic web scraper. You can use special formulas to extract data from websites, import the data directly into Google Sheets, and share it with your friends. The following sections cover easy methods for building a simple web scraper with Google Sheets.
[https://www.octoparse.com/blog/simple-web-scraping-using-google-sheets/?utm\_source=sale2022&utm\_medium=webscrapinggooglesheets&utm\_campaign=reddit](https://www.octoparse.com/blog/simple-web-scraping-using-google-sheets/?utm_source=sale2022&utm_medium=webscrapinggooglesheets&utm_campaign=reddit)
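The formulas in question are things like `=IMPORTHTML("url", "table", 1)`, which pulls the first HTML table on a page into the sheet. As a rough stand-in for what that formula does, here is the same idea in plain Python with the standard library (the sample table below is made up):

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Flatten the first HTML <table> into a list of rows, mimicking
    what Sheets' IMPORTHTML(url, "table", 1) returns."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._cell = [], None, None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._cell = ""

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._row is not None:
            self._row.append(self._cell.strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell += data

html = ("<table><tr><th>Country</th><th>Capital</th></tr>"
        "<tr><td>France</td><td>Paris</td></tr></table>")
p = TableParser()
p.feed(html)
print(p.rows)
```

Sheets handles the fetching and refreshing for you; the Python version is only here to make the underlying mechanics concrete.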
vh2rok
Octoparse_ideas
Octoparseideas
t3_vh2rok
https://www.reddit.com/r/Octoparse_ideas/comments/vh2rok/simple_web_scraping_using_google_sheets_2022/
6/21/2022 2:42:14 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Simple Web Scraping using Google Sheets (2022 updated)
False
1
vh2rok
0
32057
3
3
1
1.03092783505155
0
0
0
0
61
62.8865979381443
97
Red
10
Dash Dot Dot
20
No
1040
Posted
10/25/2021 4:07:41 AM
[View Poll](https://www.reddit.com/poll/qf8nqw)
qf8nqw
u_Octoparseideas
Octoparseideas
t3_qf8nqw
https://www.reddit.com/r/u_Octoparseideas/comments/qf8nqw/whats_your_favorite_new_feature_in_octoparse_842/
10/25/2021 4:07:41 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What's your favorite new feature in Octoparse 8.4.2 version?
False
1
qf8nqw
0
32057
3
3
Red
10
Dash Dot Dot
20
No
1039
Posted
6/20/2022 7:38:03 AM
We all know how hard it is to build an email sales list from scratch, so we need help from email scraping tools, which can collect publicly displayed email addresses using a bot. In this article, I profile the 11 best email scraping tools for sales prospecting. Let's take a look.
[https://www.octoparse.com/blog/best-email-scraping-tools-for-sales-prospecting-in-2019/?utm\_source=sale2022&utm\_medium=bestemailscrapingtools&utm\_campaign=reddit](https://www.octoparse.com/blog/best-email-scraping-tools-for-sales-prospecting-in-2019/?utm_source=sale2022&utm_medium=bestemailscrapingtools&utm_campaign=reddit)
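Under the hood, most of these tools boil down to matching an email pattern against scraped page text. A minimal Python sketch, using a pragmatic pattern that covers common addresses but not every RFC 5322 corner case:

```python
import re

# Pragmatic pattern for addresses shown in page text;
# deliberately not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_emails(page_text):
    """Return unique email addresses in order of first appearance."""
    return list(dict.fromkeys(EMAIL_RE.findall(page_text)))

sample = "Contact sales@example.com or support@example.com; sales@example.com again."
print(extract_emails(sample))  # ['sales@example.com', 'support@example.com']
```

The dedicated tools add the parts that matter at scale: crawling, deduplication across pages, and verification of the harvested addresses.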
vgfvvr
u_Octoparseideas
Octoparseideas
t3_vgfvvr
https://www.reddit.com/r/u_Octoparseideas/comments/vgfvvr/11_best_email_scraping_tools_for_sales/
6/20/2022 7:38:03 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
11 Best Email Scraping Tools for Sales Prospecting in 2022
False
1
vgfvvr
0
32057
3
3
3
2.88461538461538
2
1.92307692307692
0
0
63
60.5769230769231
104
Red
10
Dash Dot Dot
20
No
1038
Posted
10/25/2021 4:09:36 AM
[View Poll](https://www.reddit.com/poll/qf8otm)
qf8otm
Octoparse_ideas
Octoparseideas
t3_qf8otm
https://www.reddit.com/r/Octoparse_ideas/comments/qf8otm/whats_your_favorite_new_feature_in_octoparse_842/
10/25/2021 4:09:36 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What's your favorite new feature in Octoparse 8.4.2 version?
False
1
qf8otm
0
32057
3
3
0
0
0
0
0
0
5
62.5
8
Red
10
Dash Dot Dot
20
No
1037
Posted
5/27/2022 9:35:39 AM
Hi friends! When it comes to Yelp scraping, most people want to gather local business data such as the business name, contact number, website, and business hours. Today, we are going to show you how to scrape Yelp business data with Octoparse in a few easy steps.
[https://youtu.be/9UBhUQhJTGE](https://youtu.be/9UBhUQhJTGE)
uyuj4t
Octoparse_ideas
Octoparseideas
t3_uyuj4t
https://www.reddit.com/r/Octoparse_ideas/comments/uyuj4t/how_to_scrape_yelp_business_data/
5/27/2022 9:35:39 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to scrape Yelp business data
False
1
uyuj4t
0
32057
3
3
Red
10
Dash Dot Dot
20
No
1036
Posted
6/23/2022 9:15:29 AM
A social media scraper usually refers to an automatic web scraping tool that extracts data from social media channels. In this article, I further illustrate how social media datasets can be used in business and list the top 5 social media scraping tools I recommend.
[https://www.octoparse.com/blog/top-5-social-media-scraping-tools-for-2021/?utm\_source=sale2022&utm\_medium=socialmediascrapingtools&utm\_campaign=reddit](https://www.octoparse.com/blog/top-5-social-media-scraping-tools-for-2021/?utm_source=sale2022&utm_medium=socialmediascrapingtools&utm_campaign=reddit)
visp8t
Octoparse_ideas
Octoparseideas
t3_visp8t
https://www.reddit.com/r/Octoparse_ideas/comments/visp8t/top_5_social_media_scraping_tools_for_2022/
6/23/2022 9:15:29 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Top 5 Social Media Scraping Tools for 2022
False
1
visp8t
0
32057
3
3
4
4.25531914893617
0
0
0
0
56
59.5744680851064
94
Red
10
Dash Dot Dot
20
No
1035
Posted
11/26/2021 7:04:51 AM
[https://www.octoparse.com/blog/how-freelancers-make-money-using-web-scraping/?re=](https://www.octoparse.com/blog/how-freelancers-make-money-using-web-scraping/?re=)
A few days ago, I talked to James and found his experience as a freelancer quite inspiring. He started as an Excel panelist, and now his clients come from all over the world. So I recorded his story and some of the real cases he has worked on, hoping it gives you some ideas if you want to start a freelance career in web scraping.
# What Is Web Scraping?
[Web scraping](https://www.octoparse.com/WebScraping) is a process of collecting data in an automated fashion. Generally, companies use web scraping for price monitoring, customer profiling, lead generation, and targeted advertising to make smarter decisions.
# Why Become A Freelancer for Web Scraping?
Web scraping is getting more and more popular, and demand for web scraping services is high and still rising.
Back in 2019, when James worked as a cryptocurrency Excel panelist, he discovered that all of the raw data came from web scraping, which aroused his interest.
“Data is the new oil. I felt like this was the market you have to jump into right now and see how it can turn a profit,” he told me. That’s when he decided to go freelance with web scraping, and he eventually started earning a living from it.
# What Has the Freelancer Done with Web Scraping?
Without further ado, let’s get into the topic. James usually works for three kinds of industries: eCommerce, real estate, and marketing.
## I. Data Entry for eCommerce Sellers
r2hbwe
u_Octoparseideas
Octoparseideas
t3_r2hbwe
https://www.reddit.com/r/u_Octoparseideas/comments/r2hbwe/how_freelancers_make_money_using_web_scraping/
11/26/2021 7:04:51 AM
11/26/2021 7:09:58 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How Freelancers Make Money, Using Web Scraping
False
1
r2hbwe
0
32057
3
3
10
3.66300366300366
0
0
0
0
137
50.1831501831502
273
Red
10
Dash Dot Dot
20
No
1034
Posted
6/23/2022 3:41:44 AM
To download images from a list of links, you may want to look into "bulk image downloaders". Inspired by the inquiries we received, I decided to compile a list of the top 5 bulk image downloaders for you. Be sure to check out this article if you want to download images from links at zero cost.
[https://www.octoparse.com/blog/bulk-download-images-from-links-top-5-bulk-image-downloaders/?utm\_source=sale2022&utm\_medium=top5imagedownloaders&utm\_campaign=reddit](https://www.octoparse.com/blog/bulk-download-images-from-links-top-5-bulk-image-downloaders/?utm_source=sale2022&utm_medium=top5imagedownloaders&utm_campaign=reddit)
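For a handful of links, a short Python sketch can stand in for a downloader. The example URL is a placeholder, and the actual download step naturally needs network access:

```python
import os
import urllib.request
from urllib.parse import urlparse

def filename_from_url(url):
    """Derive a local file name from an image URL, falling back to a
    default when the URL path has no usable basename."""
    name = os.path.basename(urlparse(url).path)
    return name or "image.bin"

def download_all(urls, dest_dir="images"):
    """Fetch every URL in the list into dest_dir (requires network)."""
    os.makedirs(dest_dir, exist_ok=True)
    for url in urls:
        target = os.path.join(dest_dir, filename_from_url(url))
        urllib.request.urlretrieve(url, target)

# Example usage (placeholder URL, needs network):
# download_all(["https://example.com/photos/cat.jpg"])
```

The dedicated tools in the list add what this sketch lacks: retries, throttling, and pulling the image URLs out of the pages in the first place.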
vini81
u_Octoparseideas
Octoparseideas
t3_vini81
https://www.reddit.com/r/u_Octoparseideas/comments/vini81/top_5_image_downloaders_to_download_from_any/
6/23/2022 3:41:44 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Top 5 Image Downloaders to Download from Any Websites/URLs
False
1
vini81
0
32057
3
3
3
2.97029702970297
0
0
0
0
57
56.4356435643564
101
Red
10
Dash Dot Dot
20
No
1033
Posted
12/16/2021 2:46:08 AM
https://youtu.be/tQEy8aqyfUE
rhgh95
u_Octoparseideas
Octoparseideas
t3_rhgh95
https://www.reddit.com/r/u_Octoparseideas/comments/rhgh95/how_to_scrape_linkedin_jobs_with_octoparse/
12/16/2021 2:46:08 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to scrape LinkedIn jobs with Octoparse
False
1
rhgh95
0
32057
3
3
Red
10
Dash Dot Dot
20
No
1032
Posted
5/13/2022 6:47:06 AM
Octoparse is a go-to tool for connecting your business to any web data. It is simple enough to get you up and running right away, yet it also offers powerful advanced features for scraping websites of all kinds.
[https://youtu.be/Y2ArkGbigUE](https://youtu.be/Y2ArkGbigUE)
uolq8n
u_Octoparseideas
Octoparseideas
t3_uolq8n
https://www.reddit.com/r/u_Octoparseideas/comments/uolq8n/web_scraping_solution_with_octoparse/
5/13/2022 6:47:06 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Web scraping solution with Octoparse
False
1
uolq8n
0
32057
3
3
4
7.54716981132075
0
0
0
0
21
39.622641509434
53
Red
10
Dash Dot Dot
20
No
1031
Posted
9/7/2022 2:53:12 AM
Zillow is one of the most popular websites for searching homes, checking home values, and finding real estate agents. It holds a lot of data about local homes, their prices, and realtors, which is why scraped Zillow data is great for powering tools and third-party applications for commercial real estate needs. The data we scrape from Zillow will contain information about the houses for sale in any city in its database.
[https://www.octoparse.com/blog/how-to-scrape-zillow-data/?utm\_source=2022q3&utm\_medium=how-to-scrape-zillow-data&utm\_campaign=reddit](https://www.octoparse.com/blog/how-to-scrape-zillow-data/?utm_source=2022q3&utm_medium=how-to-scrape-zillow-data&utm_campaign=reddit)
x7to6n
u_Octoparseideas
Octoparseideas
t3_x7to6n
https://www.reddit.com/r/u_Octoparseideas/comments/x7to6n/zillow_scraper_scrape_zillow_real_estate_data_for/
9/7/2022 2:53:12 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Zillow Scraper: Scrape Zillow Real Estate Data for Free
False
1
x7to6n
0
32057
3
3
2
1.5748031496063
0
0
0
0
68
53.5433070866142
127
Red
10
Dash Dot Dot
20
No
1030
Posted
5/18/2022 3:37:02 AM
How much do you know about web scraping? Don't worry if you are new to the concept. In this article, we brief you on the basics of web scraping, teach you how to assess web scraping tools to find one that best fits your needs, and, last but not least, present a list of web scraping tools for your reference.
[https://www.octoparse.com/blog/9-free-web-scrapers-that-you-cannot-miss/?utm\_source=sale2022&utm\_medium=10freewebscrapers&utm\_campaign=reddit](https://www.octoparse.com/blog/9-free-web-scrapers-that-you-cannot-miss/?utm_source=sale2022&utm_medium=10freewebscrapers&utm_campaign=reddit)
us3w6u
Octoparse_ideas
Octoparseideas
t3_us3w6u
https://www.reddit.com/r/Octoparse_ideas/comments/us3w6u/10_free_web_scrapers_that_you_cannot_miss_in_2022/
5/18/2022 3:37:02 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
10 FREE Web Scrapers That You Cannot Miss in 2022
False
1
us3w6u
0
32057
3
3
Red
10
Dash Dot Dot
20
No
1029
Posted
3/23/2023 4:10:28 AM
As an eCommerce seller, you must want to learn how to scrape data and use it to boost your online business. Check out the infographic and article to find out how! [\#ecommercebusiness](https://twitter.com/hashtag/ecommercebusiness?src=hashtag_click) [\#WebScrapping](https://twitter.com/hashtag/WebScrapping?src=hashtag_click)
[https://www.octoparse.com/blog/web-scraping-and-ecommerce-business](https://www.octoparse.com/blog/web-scraping-and-ecommerce-business)
11z8bm0
Octoparse_ideas
Octoparseideas
t3_11z8bm0
https://www.reddit.com/r/Octoparse_ideas/comments/11z8bm0/how_to_scrape_data_and_use_it_to_boost_your/
3/23/2023 4:10:28 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Scrape Data and Use it to Boost Your Online Business
False
1
11z8bm0
0
32057
3
3
1
1.47058823529412
0
0
0
0
37
54.4117647058824
68
Red
10
Dash Dot Dot
20
No
1028
Posted
8/23/2022 8:28:20 AM
Nearly all social media influencers are on Twitter, where their huge follower lists hold a mountain of data you can use for analytics, promotion, and insight. Unlike other social media sites, Twitter actively allows people to make good use of its public data.
This article will therefore show you how to scrape Twitter followers’ information via Python Twint and via no-coding methods.
[https://www.octoparse.com/blog/web-scraping-job-postings/?utm\_source=2022q3&utm\_medium=how-to-scrape-twitter-followers&utm\_campaign=reddit](https://www.octoparse.com/blog/web-scraping-job-postings/?utm_source=2022q3&utm_medium=how-to-scrape-twitter-followers&utm_campaign=reddit)
wvj73w
u_Octoparseideas
Octoparseideas
t3_wvj73w
https://www.reddit.com/r/u_Octoparseideas/comments/wvj73w/how_to_scrape_twitter_followers_and_export_in/
8/23/2022 8:28:20 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How To Scrape Twitter Followers and Export in Excel
False
1
wvj73w
0
32057
3
3
1
0.87719298245614
0
0
0
0
68
59.6491228070175
114
Red
10
Dash Dot Dot
20
No
1027
Posted
6/24/2022 2:21:45 AM
Data crawling refers to collecting data, whether from the world wide web or from any document or file, for the purpose of data extraction. Here, I’d like to introduce 3 ways to crawl data from a website, along with the pros and cons of each approach.
[https://www.octoparse.com/blog/how-to-crawl-data-from-a-website/?utm\_source=sale2022&utm\_medium=crawldatafromawebsite&utm\_campaign=reddit](https://www.octoparse.com/blog/how-to-crawl-data-from-a-website/?utm_source=sale2022&utm_medium=crawldatafromawebsite&utm_campaign=reddit)
vjdj2i
u_Octoparseideas
Octoparseideas
t3_vjdj2i
https://www.reddit.com/r/u_Octoparseideas/comments/vjdj2i/how_to_crawl_data_from_a_website/
6/24/2022 2:21:45 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Crawl Data from a Website
False
1
vjdj2i
0
32057
3
3
1
1.12359550561798
1
1.12359550561798
0
0
45
50.561797752809
89
Red
10
Dash Dot Dot
20
No
1026
Posted
3/29/2022 9:05:39 AM
Welcome to Octoparse's new training session videos: Parse with Octoparse in 3 minutes.
In this session, we will walk you through how to use Octoparse in 16 lessons across 5 parts: introduction, basics, intermediate use, advanced use, and troubleshooting.
Find out lessons 1-8 below:
[https://youtu.be/jn8Ue15RVPQ](https://youtu.be/jn8Ue15RVPQ)
[https://youtu.be/QlJst0SfUG8](https://youtu.be/QlJst0SfUG8)
[https://youtu.be/o7RNlQdQIqM](https://youtu.be/o7RNlQdQIqM)
[https://youtu.be/1U8kxcWXQSI](https://youtu.be/1U8kxcWXQSI)
[https://youtu.be/XQYYF\_i16kM](https://youtu.be/XQYYF_i16kM)
[https://youtu.be/vzSE-SL-l4s](https://youtu.be/vzSE-SL-l4s)
[https://youtu.be/gyDV\_4Jeq\_I](https://youtu.be/gyDV_4Jeq_I)
[https://youtu.be/\_dQKYS9zXzc](https://youtu.be/_dQKYS9zXzc)
tqvm3p
u_Octoparseideas
Octoparseideas
t3_tqvm3p
https://www.reddit.com/r/u_Octoparseideas/comments/tqvm3p/parse_with_octoparse_in_3_minutes/
3/29/2022 9:05:39 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Parse with Octoparse in 3 minutes
False
1
tqvm3p
0
32057
3
3
2
1.68067226890756
0
0
0
0
63
52.9411764705882
119
Red
10
Dash Dot Dot
20
No
1025
Posted
9/13/2021 8:57:10 AM
Travel rules are currently changing with the Covid case curve. With the disease’s Delta variant, the cases are rising. As I am compiling this article, the EU is considering reimposing travel restrictions on U.S. visitors.
Anyway, I have built my Tripadvisor scraper with Octoparse and crawled down the information of destinations that are open to U.S. citizens. Always get prepared for a refreshing trip.
Note: If you are setting out to [these countries](https://edition.cnn.com/travel/article/us-international-travel-covid-19/index.html), you may want to check if vaccination or quarantine is needed.
By the way, [web scraping](https://en.wikipedia.org/wiki/Web_scraping) is the best way to pull down web data so we can sift through it and get the most value out of it. I will show how it helps me get the travel data.
[Geo Map generated by mapchart.net](https://preview.redd.it/wncnyazsi8n71.png?width=698&format=png&auto=webp&v=enabled&s=5f43df49ffb9098feb9a5337f2cdb702e683c057)
# Web Scraping Travel Data
Do you have any idea about [big data in tourism](https://www.octoparse.com/blog/big-data-in-tourism)?
Businesses in the travel industry track all kinds of data, for example, business data of travel agents and visitors' behavioral data on travel-related platforms. They may know your traveling habits better than you do. The whole industry leverages big data to launch the right products and find the right people to pay for their services.
Web scraping is the tech that makes this possible.
Well, as a traveler, I want web scraping to serve my own needs: finding the most attractive destinations and getting the guides from Tripadvisor for my reference.
**What I am going to do**
* First of all, I need a list of countries to look into.
* Secondly, I will use a web scraping tool, Octoparse, to build a Tripadvisor scraper and crawl these countries’ travel data.
* Finally, I am going to pack my baggage and head for the destination that fits my travel taste most!
# Where Can an American Go
So, where can an American go for travel now?
[This article by CNN](https://edition.cnn.com/travel/article/us-international-travel-covid-19/index.html) lists the destinations that are open to the U.S. (the list may be updated now and then).
What I want to do is pull all the country names on this web page into a spreadsheet so I can paste them into Octoparse and get more specific data from Tripadvisor.
[Octoparse: How to get list information on a web page into excel](https://preview.redd.it/47rtyaowi8n71.png?width=699&format=png&auto=webp&v=enabled&s=2f55d1b98be04eb9b9e71e295bd7fb3a44acbb88)
Octoparse can easily get list information from a web page into Excel or CSV.
This is extremely helpful when you want a list of URLs or data to paste and search on another platform, or to import into data analytics software for analysis.
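For readers who'd rather script it, the same list-to-CSV idea can be sketched in plain Python with the standard library. The HTML snippet below is a made-up stand-in, not the CNN page's actual markup:

```python
from html.parser import HTMLParser
import csv, io

class ListExtractor(HTMLParser):
    """Collects the text of every <li> element on a page."""
    def __init__(self):
        super().__init__()
        self.in_li = False
        self.items = []
    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.in_li = True
    def handle_endtag(self, tag):
        if tag == "li":
            self.in_li = False
    def handle_data(self, data):
        if self.in_li and data.strip():
            self.items.append(data.strip())

# Stand-in snippet; the real page's markup will differ.
html = "<ul><li>Nepal</li><li>Mexico</li><li>Costa Rica</li></ul>"
parser = ListExtractor()
parser.feed(html)

# Write the list to CSV, one destination per row.
buf = io.StringIO()
csv.writer(buf).writerows([[c] for c in parser.items])
```

The resulting CSV opens directly in Excel, which matches what Octoparse exports from its point-and-click flow.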
Now that I have the text list of destinations, I am going to build a Tripadvisor scraper to get specific data about these places.
# Build a TripAdvisor Scraper
The data I am going to crawl from Tripadvisor:
* I want to check the travel popularity of these countries, using the number of reviews about each country on Tripadvisor as a proxy. (My hypothesis: more visits, more reviews.)
* I have my travel theme. I am a nature lover interested in outdoor events and nature sightseeing. I will get the tag information of these destinations so that I can filter through and niche down to the perfect place where I can chase the wind, play on the beach or appreciate the grandeur of a peak.
* I will save the URL of travel guides on Tripadvisor for further travel planning. (Thanks contributors!)
## Batch Generate URLs with Country Names
Where to get this data? This is a sample page: [Tripadvisor Nepal](https://www.tripadvisor.com/Search?q=Nepal&searchSessionId=628D87C594BA0F3C2D5F64F9187E6C0E1630569008168ssid&sid=CE17A104D3744921A306A608605241AB1630574430004&blockRedirect=true&ssrc=a&geo=1&rf=2).
With the list of country names I have scraped in the previous step, I can batch generate all Tripadvisor country pages with Octoparse.
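Octoparse batch-generates these pages for you, but the idea is simple enough to sketch in Python. The URL pattern below is trimmed from the sample link above and drops Tripadvisor's session parameters, so treat it as illustrative only:

```python
from urllib.parse import quote_plus

# Country names as scraped in the previous step (sample values).
countries = ["Nepal", "Costa Rica", "Mexico"]

# Simplified pattern based on the sample search URL above; the real
# query string carries extra session parameters.
BASE = "https://www.tripadvisor.com/Search?q={}&blockRedirect=true"

urls = [BASE.format(quote_plus(c)) for c in countries]
```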
[Keep reading: how to create your own TripAdvisor scraper](https://www.octoparse.com/blog/tripadvisor-scraper-top-destinations-open-to-the-us-citizens-under-covid/?re=)
pnbokm
u_Octoparseideas
Octoparseideas
t3_pnbokm
https://www.reddit.com/r/u_Octoparseideas/comments/pnbokm/tripadvisor_scraper_top_destinations_open_to_the/
9/13/2021 8:57:10 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Tripadvisor Scraper: Top Destinations Open to the U.S. Citizens under Covid
False
1
pnbokm
0
32057
3
3
15
2
0
0
0
0
357
47.6
750
Red
10
Dash Dot Dot
20
No
1024
Posted
7/5/2021 7:18:54 AM
[removed]
oe1zcp
u_Octoparseideas
Octoparseideas
t3_oe1zcp
https://www.reddit.com/r/u_Octoparseideas/comments/oe1zcp/how_does_octoparse_work_as_a_problemsolver_for/
7/5/2021 7:18:54 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How Does Octoparse Work As A Problem-Solver for Data Scraping?
False
1
oe1zcp
0
32057
3
3
0
0
0
0
0
0
1
100
1
Red
10
Dash Dot Dot
20
No
1023
Posted
8/31/2022 7:17:16 AM
This article will serve as a guide to give you insights into the Data Extraction procedure, its types, and its perks. Additionally, we will talk about the top 10 data extraction tools to watch out for in 2022.
[https://www.octoparse.com/blog/top-data-extraction-tools/?utm\_source=2022q3&utm\_medium=top-data-extraction-tools&utm\_campaign=reddit](https://www.octoparse.com/blog/top-data-extraction-tools/?utm_source=2022q3&utm_medium=top-data-extraction-tools&utm_campaign=reddit)
x271pv
Octoparse_ideas
Octoparseideas
t3_x271pv
https://www.reddit.com/r/Octoparse_ideas/comments/x271pv/top_10_data_extraction_tools_in_2022/
8/31/2022 7:17:16 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Top 10 Data Extraction Tools in 2022
False
1
x271pv
0
32057
3
3
5
6.17283950617284
0
0
0
0
47
58.0246913580247
81
Red
10
Dash Dot Dot
20
No
1022
Posted
2/21/2022 10:22:38 AM
Original post: [https://www.octoparse.com/blog/octoparse-85-empowers-the-local-web-scraping/?re=](https://www.octoparse.com/blog/octoparse-85-empowers-the-local-web-scraping/?re=)
Here is the exciting news: Octoparse 8.5 is now released with game-changing new features and major improvements. We all know we can count on cloud scraping when it comes to scraping fast at scale, but this time we want to make local scraping just as competitive.
https://preview.redd.it/arqgth38x5j81.png?width=800&format=png&auto=webp&v=enabled&s=0051e4273998f4b3e23685d24a50c3ca38150724
&#x200B;
https://preview.redd.it/7l6t0pc9x5j81.png?width=800&format=png&auto=webp&v=enabled&s=27146305a4cd9b869488f11c048e577795fab265
sxqbgu
u_Octoparseideas
Octoparseideas
t3_sxqbgu
https://www.reddit.com/r/u_Octoparseideas/comments/sxqbgu/octoparse_85_empowering_local_scraping_and_more/
2/21/2022 10:22:38 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse 8.5: Empowering Local Scraping and More
False
1
sxqbgu
0
32057
3
3
4
5
1
1.25
0
0
38
47.5
80
Red
10
Dash Dot Dot
20
No
1021
Posted
2/15/2022 8:02:50 AM
[https://www.octoparse.com/blog/movie-crawler-scraping-100-000plus-movie-information/?re=](https://www.octoparse.com/blog/movie-crawler-scraping-100-000plus-movie-information/?re=)
What data can you get with a movie scraper? Well, check it out below:
* Movie name
* Year
* Category
* Ratings
* Introduction
* Cast
* Cover image (URL)
And you may scrape other data, such as movie reviews or TV show information, as long as it is there on the web page. With the help of Octoparse, you can create and customize your scraper to get whatever data you want once you get the hang of it.
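If you prefer to organize the output yourself, the field list above maps naturally onto a small record type that can be dumped straight to CSV. The movie values below are made up for illustration:

```python
from dataclasses import dataclass, asdict, fields
import csv, io

@dataclass
class Movie:
    # One row of scraper output; field names mirror the list above.
    name: str
    year: int
    category: str
    rating: float
    introduction: str
    cast: str
    cover_image_url: str

# Sample record with invented values.
rows = [Movie("Inception", 2010, "Sci-Fi", 8.8, "A thief enters dreams.",
              "Leonardo DiCaprio", "https://example.com/inception.jpg")]

# Export to CSV with the schema as the header row.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(Movie)])
writer.writeheader()
writer.writerows(asdict(m) for m in rows)
```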
https://preview.redd.it/7985mzxyeyh81.png?width=1080&format=png&auto=webp&v=enabled&s=62f7c835438c7a854d04245bb7286e12bfecc944
ssxurq
u_Octoparseideas
Octoparseideas
t3_ssxurq
https://www.reddit.com/r/u_Octoparseideas/comments/ssxurq/easytouse_movie_scraper_scraping_movies_from_imdb/
2/15/2022 8:02:50 AM
2/15/2022 8:08:33 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Easy-to-use Movie Scraper | Scraping Movies from IMDb, Flixster, etc.
False
1
ssxurq
0
32057
3
3
1
1.01010101010101
1
1.01010101010101
0
0
53
53.5353535353535
99
Red
10
Dash Dot Dot
20
No
1020
Posted
8/29/2022 7:52:28 AM
If you have no technical knowledge of web scraping and need a tool with plenty of powerful options, give this article a read: we are going to discuss a free Quora scraper. With this free tool, you will be able to get Quora data as JSON or CSV files.
[https://www.octoparse.com/blog/web-scraping-quora/?utm\_source=2022q3&utm\_medium=web-scraping-quora&utm\_campaign=reddit](https://www.octoparse.com/blog/web-scraping-quora/?utm_source=2022q3&utm_medium=web-scraping-quora&utm_campaign=reddit)
x0iw6a
Octoparse_ideas
Octoparseideas
t3_x0iw6a
https://www.reddit.com/r/Octoparse_ideas/comments/x0iw6a/how_to_scrape_questions_and_answers_data_from/
8/29/2022 7:52:28 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Scrape Questions and Answers Data from Quora
False
1
x0iw6a
0
32057
3
3
3
3.125
0
0
0
0
51
53.125
96
Red
10
Dash Dot Dot
20
No
1019
Posted
6/11/2021 6:30:44 AM
[removed]
nx994i
u_Octoparseideas
Octoparseideas
t3_nx994i
https://www.reddit.com/r/u_Octoparseideas/comments/nx994i/price_comparison_website_how_to_build_it_and/
6/11/2021 6:30:44 AM
1/1/0001 12:00:00 AM
False
False
5
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Price Comparison Website: How To Build It And Source The Data
False
0.86
nx994i
0
32057
3
3
0
0
0
0
0
0
1
100
1
Red
10
Dash Dot Dot
20
No
1018
Posted
5/30/2022 9:05:57 AM
Nowadays people use PDFs on a large scale for reading, presenting, and many other purposes. Many websites store data in PDF files for viewers to download instead of posting it on web pages, which brings challenges to web scraping. You can view, save, and print PDF files with ease. The problem is that PDF is designed to preserve the integrity of a file: it is an "electronic paper" format that makes sure content looks the same on any computer at any time. That is why it is difficult to edit a PDF file or export data from it.
Fortunately, there are some solutions that help extract data from PDF into Excel.
[https://www.octoparse.com/blog/how-to-extract-pdf-into-excel/?utm\_source=sale2022&utm\_medium=extractdatafrompdf&utm\_campaign=reddit](https://www.octoparse.com/blog/how-to-extract-pdf-into-excel/?utm_source=sale2022&utm_medium=extractdatafrompdf&utm_campaign=reddit)
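As a rough illustration of one such solution, the cleanup half of the job (turning text pulled out of a PDF into rows Excel can open) can be sketched in Python. The sample text imitates what a PDF text extractor such as pdfplumber's `extract_text` might return and is invented here:

```python
import csv, io

# Text as a PDF text extractor might return it; the lines are
# invented for illustration.
raw = """Item       Qty   Price
Widget     2     9.99
Gadget     5     14.50"""

# Split each whitespace-aligned line into columns.
rows = [line.split() for line in raw.splitlines() if line.strip()]

# Write a CSV that Excel can open directly.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
```

Splitting on whitespace only works for simple, well-aligned tables; dedicated PDF table extractors handle merged cells and multi-word values far better.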
v0w2iw
u_Octoparseideas
Octoparseideas
t3_v0w2iw
https://www.reddit.com/r/u_Octoparseideas/comments/v0w2iw/how_to_extract_data_from_pdf_to_excel_without/
5/30/2022 9:05:57 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Extract Data from PDF to Excel Without Coding skills
False
1
v0w2iw
0
32057
3
3
5
3.2258064516129
2
1.29032258064516
0
0
79
50.9677419354839
155
Red
10
Dash Dot Dot
20
No
1017
Posted
9/10/2021 3:16:54 AM
[http://www.dataextraction.io/?p=1125/?re=](http://www.dataextraction.io/?p=1125/?re=)
[Indeed](https://www.indeed.com/about) is one of the best job boards in the world. Since its inception in 2004, it has always been a go-to website for many job seekers all over the world. Thousands of jobs across several niches are posted daily and without a doubt, the website is loaded regularly with data.
For those that know about Indeed job scraping, the data on Indeed has proven to be more than valuable. In case you are wondering how to scrape data from Indeed or why you even need this data, this is the article for you.
In this article, we will look at what you can do with Indeed data and how to get started with Indeed job scraping.
# Why do you need an Indeed Job scraper
## 1. Build a job board
People are always looking for jobs online. Creating an online directory where jobs are constantly posted is a great way for you to help people and make money. However, to do that, you would need to scrape job postings from a big job board like Indeed.
With job data from Indeed, you can create a city-specific job board that only displays jobs from a particular city. Industry-specific job boards are also in high demand, so you can consider those too. All you have to do is create a specialized web crawler to extract this specific data, download it, and build your job board.
## 2. Understand the job market
The data you get from Indeed job scraping can give you valuable insights into the job market. Indeed is filled with data about job salaries, requirements, skills, and experience. Every job posting lists the salary it offers as well as the skills and experience it requires.
With all this data, your HR team can better analyze job trends and the labor market in general. [Job analysis](https://www.businessmanagementideas.com/human-resource-management-2/uses-of-job-analysis/uses-of-job-analysis-human-resource-management/19260) is a key part of human resource management, but without job data there can't be any job analysis. Hence, you need a job scraper.
With all that said, you are most likely thinking it must be very difficult to build an Indeed scraper. If that's the question on your mind right now, stay tuned: in the next section, we will answer it.
# Is it hard to build an Indeed scraper?
Let’s answer this question straight away. NO! It is not hard to build an Indeed scraper. Even if you don’t have any programming experience, you can still build one using a web scraping tool.
A web scraping tool lets you create a web crawler by simply interacting with a software or web application; no coding experience or technical knowledge is needed. Octoparse is one of [many web scraping tools](https://www.octoparse.com/blog/9-free-web-scrapers-that-you-cannot-miss#:~:text=Unlike%20other%20web%20scrapers%20that,%2C%20JavaScript%2C%20cookies%20and%20etc.&text=Octoparse%20can%20even%20deal%20with,by%20parsing%20the%20source%20code.). It has all the [features](https://www.octoparse.com/featurescomparison) you need for a great web scraping experience. With Octoparse, Indeed job scraping is easy.
# How to use Octoparse to build an Indeed Scraper?
With Octoparse version 8.1 beta, you can scrape data from Indeed by:
* Using the Indeed Prebuilt template
* Building a crawler from Scratch
## Using the Indeed Prebuilt template
* Click the Indeed template on the homepage to use the ready-made template directly
https://preview.redd.it/2zagax9gflm71.png?width=989&format=png&auto=webp&v=enabled&s=339eb89bb27e15fa57b957677ee0f99ba8b04c9f
To learn more about templates and how to use them, click [here](https://helpcenter.octoparse.com/hc/en-us/articles/360028582331-Introducing-Easy-Template-A-Scraping-Solution-for-Muggles).
## Building a crawler from scratch
1. Go to the web page you want to scrape
2. Create the workflow
3. Start the extraction
**Step 1. Go to the web page you want to scrape**
* Enter the Indeed page URL you want to scrape in the URL bar on the homepage
* Click “Start”
Here is the URL used in this example.
[https://www.indeed.com/jobs?q=devops&l=Dallas-Fort%20Worth,%20TX&radius=50](https://www.indeed.com/jobs?q=devops&l=Dallas-Fort%20Worth,%20TX&radius=50)
https://preview.redd.it/b3udeu5iflm71.png?width=984&format=png&auto=webp&v=enabled&s=5c7835011fcf1d34cf310fa3f85b985318242bd2
**Step 2. Create the workflow**
* Click “Auto-detect web page data”
https://preview.redd.it/n88e4nqjflm71.png?width=1172&format=png&auto=webp&v=enabled&s=a2e791539cb05b8c9163201a197ce05267b12982
* Wait till you see “Auto-detect completed”
* Check the data preview to see if there’s any unnecessary data field you would like to delete.
https://preview.redd.it/eavg4s8lflm71.png?width=948&format=png&auto=webp&v=enabled&s=dbe63c1879c0e562ecb74e26083a17bd749b4066
* Click on “Create workflow”
https://preview.redd.it/o3c0m32oflm71.png?width=684&format=png&auto=webp&v=enabled&s=01aacee23008ce2de29756b386b8fbe3e9b0512f
**Step 3. Start Extraction**
* Setup “[wait before action](https://helpcenter.octoparse.com/hc/en-us/articles/900001640566)”.
* Click the setting icon beside the “extract data” button
* Tick “wait before action”
* Set it to 1 or 2s.
* Start Extraction
* Click “Save”
* Click “Run”
* Select “[Run task on your device](https://helpcenter.octoparse.com/hc/en-us/articles/360018281491-Run-tasks-on-local-machine)” to run the task on your PC, or select “[Run task in the Cloud](https://helpcenter.octoparse.com/hc/en-us/articles/360018008712-Run-Schedule-tasks-in-the-cloud)” to run the task in the cloud.
* You can also [schedule a task](https://helpcenter.octoparse.com/hc/en-us/articles/900001763803-Schedule-tasks-to-run) to run the task regularly.
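Outside Octoparse, the same loop (build one results URL per page, then pause between requests like the "wait before action" setting) might be sketched like this. The helper and its parameters are assumptions based on the example URL above, not an official Indeed API:

```python
import time
from urllib.parse import urlencode

def indeed_search_url(query, location, start=0):
    # Hypothetical helper mirroring the example URL above; Indeed's
    # real query parameters may change without notice.
    params = {"q": query, "l": location, "start": start}
    return "https://www.indeed.com/jobs?" + urlencode(params)

def crawl_pages(n_pages, delay=1.0):
    """Yield one search URL per results page, pausing between
    iterations like Octoparse's 'wait before action' (1-2s)."""
    for page in range(n_pages):
        yield indeed_search_url("devops", "Dallas-Fort Worth, TX", page * 10)
        time.sleep(delay)

urls = list(crawl_pages(3, delay=0))
```

Each yielded URL would then be fetched and parsed; the delay keeps the crawl polite and mirrors the wait step configured in the workflow above.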
https://preview.redd.it/si3mozxpflm71.png?width=1025&format=png&auto=webp&v=enabled&s=a26c93cfa8f2b7000d55742861cbc21b08918903
Once all that is done, it will take a few seconds for the software to load and present your data. Here’s a sample of what your extracted data will look like:
https://preview.redd.it/0fh6z1crflm71.png?width=1279&format=png&auto=webp&v=enabled&s=f6fe19883b4bcb39e849e8b2b3fc97f72f499b7b
# Conclusion
Octoparse is a great web scraping tool that gives you all you need to scrape job postings from Indeed. So much can be done with the job data from Indeed. Hopefully, with this article, we’ve been able to inform you about what you can do with this data. If you have any issues, complaints, or questions, feel free to [contact us](https://helpcenter.octoparse.com/hc/en-us/requests/new) anytime for help.
plcxfn
u_Octoparseideas
Octoparseideas
t3_plcxfn
https://www.reddit.com/r/u_Octoparseideas/comments/plcxfn/job_scraping_easily_scrape_job_posting_from_indeed/
9/10/2021 3:16:54 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Job Scraping: Easily Scrape Job Posting from Indeed
False
1
plcxfn
0
32057
3
3
16
1.57170923379175
13
1.2770137524558
0
0
514
50.4911591355599
1018
Red
10
Dash Dot Dot
20
No
1016
Posted
5/13/2022 6:47:55 AM
Octoparse is go-to software for connecting your business to any web data. It is simple enough that you can get up and running right away, yet it also contains powerful advanced features for scraping websites of all kinds.
[https://youtu.be/Y2ArkGbigUE](https://youtu.be/Y2ArkGbigUE)
uolqna
Octoparse_ideas
Octoparseideas
t3_uolqna
https://www.reddit.com/r/Octoparse_ideas/comments/uolqna/web_scraping_solution_with_octoparse/
5/13/2022 6:47:55 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Web scraping solution with Octoparse
False
1
uolqna
0
32057
3
3
4
7.54716981132075
0
0
0
0
21
39.622641509434
53
Red
10
Dash Dot Dot
20
No
1015
Posted
12/28/2021 4:02:46 AM
https://youtu.be/OneU-njIsXE
rq66lq
webscraping
Octoparseideas
t3_rq66lq
https://www.reddit.com/r/webscraping/comments/rq66lq/how_to_scrape_app_reviews_from_google_play_with/
12/28/2021 4:02:46 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to scrape app reviews from Google Play with Octoparse
False
0.25
rq66lq
0
32057
3
3
Red
10
Dash Dot Dot
20
No
1014
Posted
12/16/2021 2:47:42 AM
https://youtu.be/tQEy8aqyfUE
rhgib1
webscraping
Octoparseideas
t3_rhgib1
https://www.reddit.com/r/webscraping/comments/rhgib1/how_to_scrape_linkedin_jobs_with_octoparse/
12/16/2021 2:47:42 AM
1/1/0001 12:00:00 AM
False
False
0
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to scrape LinkedIn jobs with Octoparse
False
0.4
rhgib1
0
32057
3
3
Red
10
Dash Dot Dot
20
No
1013
Posted
6/13/2022 4:05:47 AM
Take an EXTRA 10% off everything on Jun.15th only!
【Standard Year】Save $271 + FREE crawler + 1-on-1 training
【Professional Year】Save $800 + FREE crawler\*3 + 1-on-1 training\*3
👉 Click to check out the deals: [https://www.octoparse.com/summer-sale-2022/?utm\_source=reddityure&utm\_medium=startingin2days&utm\_campaign=22summersale](https://www.octoparse.com/summer-sale-2022/?utm_source=reddityure&utm_medium=startingin2days&utm_campaign=22summersale)
https://preview.redd.it/wo37ldm4cb591.png?width=800&format=png&auto=webp&v=enabled&s=37142fd0eb67cf31f340b238a1b6183738d62a4a
vb3h8w
u_Octoparseideas
Octoparseideas
t3_vb3h8w
https://www.reddit.com/r/u_Octoparseideas/comments/vb3h8w/octoparse_summer_sale_starts_in_2_days/
6/13/2022 4:05:47 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
💥 Octoparse Summer Sale Starts in 2 Days
False
1
vb3h8w
0
32057
3
3
2
2.8169014084507
0
0
0
0
44
61.9718309859155
71
Red
10
Dash Dot Dot
20
No
1012
Posted
7/18/2022 8:29:27 AM
[https://youtu.be/8U3-VLSp3vA](https://youtu.be/8U3-VLSp3vA)
Web scraping itself is not illegal. Scraping public data has been ruled legal by a U.S. appeals court, but for people who want to give it a shot, it is still important to understand the risks around web scraping practices. Check out Octoparse's new video to learn whether web scraping is legal and why.
w1u5k3
u_Octoparseideas
Octoparseideas
t3_w1u5k3
https://www.reddit.com/r/u_Octoparseideas/comments/w1u5k3/is_web_scraping_legal_and_why/
7/18/2022 8:29:27 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Is Web Scraping Legal and Why?
False
1
w1u5k3
0
32057
3
3
1
1.49253731343284
2
2.98507462686567
0
0
35
52.2388059701493
67
Red
10
Dash Dot Dot
20
No
1011
Posted
2/21/2022 10:24:01 AM
Original post: [https://www.octoparse.com/blog/octoparse-85-empowers-the-local-web-scraping/?re=](https://www.octoparse.com/blog/octoparse-85-empowers-the-local-web-scraping/?re=)
Here is the exciting news: Octoparse 8.5 is now released with game-changing new features and major improvements. We all know we can count on cloud scraping when it comes to scraping fast at scale, but this time we want to make local scraping just as competitive.
https://preview.redd.it/2364t7phx5j81.png?width=800&format=png&auto=webp&v=enabled&s=687c28c60b51bf2bb47ce42fe064bdbbd5b050d9
&#x200B;
https://preview.redd.it/5wy3tf3jx5j81.png?width=800&format=png&auto=webp&v=enabled&s=cacaa6f4a36d5a94d777dfdf3d6ce3b424f40c74
sxqc82
Octoparse_ideas
Octoparseideas
t3_sxqc82
https://www.reddit.com/r/Octoparse_ideas/comments/sxqc82/octoparse_85_empowering_local_scraping_and_more/
2/21/2022 10:24:01 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Octoparse 8.5: Empowering Local Scraping and More
False
1
sxqc82
0
32057
3
3
4
5
1
1.25
0
0
38
47.5
80
Red
10
Dash Dot Dot
20
No
1010
Posted
12/1/2021 2:28:49 AM
[https://youtu.be/aEh1coudY9s](https://youtu.be/aEh1coudY9s)
In this tutorial, I’ll show you how to use web scraping templates in Octoparse 8.4 to extract Amazon product reviews in 3 easy steps.
The ready-to-use templates are a unique feature of Octoparse. They are prebuilt crawlers for scraping popular websites such as Amazon, Facebook, and many more. Since all the data fields are pre-set, there is no need to configure the crawler yourself; simply enter the search value and it will fetch the data for you right away.
The Amazon template we picked in this video takes ASINs as input and gathers the rating, review date, and review text of each product.
The sample output gives you an idea of what the end result will look like when the run completes. We should be able to get all the information in a nice, structured format!
r643r3
Octoparse_ideas
Octoparseideas
t3_r643r3
https://www.reddit.com/r/Octoparse_ideas/comments/r643r3/how_to_scrape_amazon_product_reviews_in_three/
12/1/2021 2:28:49 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to scrape Amazon product reviews in three easy steps
False
1
r643r3
0
32057
3
3
Red
10
Dash Dot Dot
20
No
1009
Posted
11/23/2021 7:15:25 AM
TAKE 30% OFF when Renew or Upgrade
【Standard Year】SAVE $200!
【Professional Year】SAVE $500!
👉 Get free crawlers & 1-on-1 training: [https://www.octoparse.com/2021-black-friday-sale/?re=](https://www.octoparse.com/2021-black-friday-sale/?re=)
&#x200B;
Check out the services you need:
[https://service.octoparse.com/data-service/?re=](https://service.octoparse.com/data-service/?re=)
[https://service.octoparse.com/ecommercedata/?re=](https://service.octoparse.com/ecommercedata/?re=)
[https://service.octoparse.com/socialmedia/?re=](https://service.octoparse.com/socialmedia/?re=)
[https://service.octoparse.com/contentaggregation/?re=](https://service.octoparse.com/contentaggregation/?re=)
[https://service.octoparse.com/enterprise/?re=](https://service.octoparse.com/enterprise/?re=)
[https://service.octoparse.com/webscrapingtemplates/?re=](https://service.octoparse.com/webscrapingtemplates/?re=)
r07l5a
u_Octoparseideas
Octoparseideas
t3_r07l5a
https://www.reddit.com/r/u_Octoparseideas/comments/r07l5a/hurry_up_black_friday_is_ending_soon/
11/23/2021 7:15:25 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
👏 Hurry up! Black Friday Is Ending Soon
False
1
r07l5a
0
32057
3
3
1
0.813008130081301
0
0
0
0
80
65.0406504065041
123
Red
10
Dash Dot Dot
20
No
1008
Posted
12/6/2022 8:51:43 AM
**Whether you love football or not, you can hardly escape FIFA World Cup 2022. Does the 2022 World Cup "truly" defy prediction? We scraped betting odds from professional betting agencies and tried to find the answer.**
**Check out this article to find out more information:**
[https://www.octoparse.com/blog/scrap-fifa-world-cup-betting-odds?utm\_source=reddit&utm\_medium=social&utm\_campaign=article-promotion](https://www.octoparse.com/blog/scrap-fifa-world-cup-betting-odds?utm_source=reddit&utm_medium=social&utm_campaign=article-promotion)
ze0xzc
Octoparse_ideas
Octoparseideas
t3_ze0xzc
https://www.reddit.com/r/Octoparse_ideas/comments/ze0xzc/web_scraping_fifa_world_cup_2022_betting_odds/
12/6/2022 8:51:43 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Web Scraping FIFA World Cup 2022 Betting Odds
False
1
ze0xzc
0
32057
3
3
1
1.11111111111111
3
3.33333333333333
0
0
57
63.3333333333333
90
Red
10
Dash Dot Dot
20
No
1007
Posted
9/30/2021 7:18:56 AM
Nowadays companies use an ATS (Applicant Tracking System) to post jobs and find the perfect candidates. But in a competitive industry like healthcare, candidates won't necessarily come to your company's career site and search for jobs. HR personnel then copy their job lists and post them on different career websites. However, there are a few problems with this manual approach. First, it requires a lot of manual work; copying and pasting across different career websites often introduces clerical errors, so candidates get confused and give a listing a pass. Second, your posts on different sites won't stay updated: once you change a detail in the job description, you have to go back and amend each post manually.
As a result, it is not hard to imagine that even a good company may not have the best fishing net to capture all the best candidates. However, there is an excellent tool that saves HR a ton of time while sourcing as many good candidates as possible from different career sites. A scraping tool like Octoparse is the answer: it enables an automated job board that integrates with a company's ATS and pushes the latest listings to different career sites.
Breathing is automated. It makes our life so easy. We realize its importance, even more, when we see people going on ventilators. But are your HR operations and processes automated? Is your job creation, listing, interviewing, and hiring process fully optimized?
We ask these questions because we see a lot of companies struggling unnecessarily. Several enterprises are still manually posting their vacancies to different job boards. We at Octoparse believe that is quite outdated in the age of AI & data science.
“The world is changing whether you like it or not. Get involved or get left behind.” — Dave Waters
And so, in this article, we emphasize the importance of automating some of the cumbersome HR processes and operations. We will demonstrate how to use web crawling to scrape job listings from career sites, then post the scraped job data automatically to job boards of your choice using XML job feeds.
We provide a seamless job wrapping solution stack consisting of an Applicant Tracking System (ATS), Octoparse scraping services, and XML feeds, which integrates easily into your existing HR tech stack.
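As a minimal sketch of the XML-feed piece, here is how a scraped listing might be serialized with Python's standard library. The tag names and sample job are illustrative; each job board publishes its own required feed schema:

```python
import xml.etree.ElementTree as ET

# One scraped listing; values invented for illustration.
jobs = [{"title": "ICU Nurse", "location": "Dallas, TX",
         "url": "https://example.com/jobs/1"}]

# Build <jobs><job><title>...</title>...</job></jobs>.
root = ET.Element("jobs")
for job in jobs:
    node = ET.SubElement(root, "job")
    for key, value in job.items():
        ET.SubElement(node, key).text = value

feed = ET.tostring(root, encoding="unicode")
```

A feed like this, regenerated whenever the scraper runs, is what keeps every downstream job board in sync without manual reposting.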
https://preview.redd.it/fhl687y3dlq71.png?width=512&format=png&auto=webp&v=enabled&s=066d058077526d3724f0649ecaeec3c2bd32be0a
Not to mention, this relieves the HR team of manual work and lets them focus on more critical aspects of the hiring cycle, such as crafting a perfect job description and training & onboarding new talent.
[Keep reading: What is an Applicant Tracking System (ATS)?](http://www.dataextraction.io/?p=1135/?re=)
pyepi7
Octoparse_ideas
Octoparseideas
t3_pyepi7
https://www.reddit.com/r/Octoparse_ideas/comments/pyepi7/automate_job_feed_scraping_posting_to_scaleup/
9/30/2021 7:18:56 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Automate Job Feed Scraping & Posting To Scale-Up Your Business
False
1
pyepi7
0
32057
3
3
16
3.470715835141
7
1.51843817787419
0
0
233
50.5422993492408
461
Red
10
Dash Dot Dot
20
No
1006
Posted
3/7/2022 9:34:26 AM
*Originally published as* [*https://www.octoparse.com/blog/b2b-lead-generation-top-10-tools-for-digital-marketing/?re=*](https://www.octoparse.com/blog/b2b-lead-generation-top-10-tools-for-digital-marketing/?re=) *on March 7, 2022.*
B2B Lead generation is important for growing your business as it decides if your salespeople are going to close more deals or not.
In this article, you will learn about what B2B lead generation is, and how to conduct it effectively using some of the best tools in the market.
https://preview.redd.it/3glltt19lxl81.png?width=1350&format=png&auto=webp&v=enabled&s=fc034d3bde526bd7b881bda9225fa11a97703348
# What is B2B lead generation?
B2B [Lead generation](https://en.wikipedia.org/wiki/Lead_generation) is known as the process of identifying and attracting potential customers.
These people may be random visitors who just happened to bump into your website (thanks to your SEO team) or people who are actually looking for help.
What they have in common is that they fit your customer persona: they are very likely looking for your product to help them out of a dilemma! These people should be the target of your marketing and sales teams and should definitely be absorbed into your lead generation system, or marketing-sales system, whatever you call it.
Lead generation opens the door to more sales. Yet, in most cases, your lead is not the one who walks straight over to the cashier and pays right away. That's where cultivation comes in: leading your potential customer from the stage of awareness to purchase. So educate them; tell them what it is and how it works.
In short, to turn leads into sales, you’ll need to:
* Attract more visitors
* Identify potential buyers
* Cultivate potential customers
* Initiate the sale and close deals (hurray!)
# What are B2B sales leads?
If lead generation is about converting a stranger who has never heard of your company into a customer who pays the bill, then when we run into one of these strangers, we'd better know exactly where they are in the sales funnel.
People at different stages require customized attention. Handing every potential lead over to your sales team is not an effective way to achieve more sales. There are good reasons why leads are generally categorized into two types: MQLs (marketing qualified leads) and SQLs (sales qualified leads).
[Source: https:\/\/www.digitalbrew.com\/6-sales-funnel-stages-video-amplifies\/](https://preview.redd.it/ulxwvqtalxl81.png?width=1342&format=png&auto=webp&v=enabled&s=6a48022b5e4ac02630344c40ac5367e8eb309091)
## What is MQL?
A marketing qualified lead lingers at the stage of awareness or interest. These are people who are relevant to you but not yet ready to purchase.
People with these behaviors may be recognized as an MQL:
* Visit your website regularly
* Download your ebook or white paper
* Sign up for your next webinar
* Send emails or start an online chat
* Subscribe to your newsletter
That’s how people show their interest.
The marketing team should dig into their needs and escort them through the evaluation stage with a clear demo (or other resources) that answers their concerns.
## What is SQL?
A sales qualified lead is a consumer who is ready to buy. They know what they need, and they are studying the specifications and making comparisons until they can pick the best candidate, take out the wallet, and say: “wrap it up”.
A sales qualified lead may:
* Ask for a price list with a full plan breakdown
* Send an email with concrete questions about what your services can and cannot do
* Schedule a phone call for a full product demo
* Submit a form describing the challenges they are facing so the sales team can connect
* Enter their credit card info and sign up for a trial
They are the prospects salespeople would love to talk to (and the ones they should talk to). Time to act quickly! Dive deep into what they are looking for and draw up a custom solution that fits their needs, or go further and run a POC (proof of concept) to prove the capability of what you are offering.
# How to conduct lead generation that leads to sales?
This is teamwork: a marketing team to attract and cultivate leads, and a sales team to qualify leads, communicate, and close deals.
If we focus on lead generation only, the concept overlaps in many ways with what we call “inbound marketing”: attracting MQLs and SQLs to your system by placing valuable content where they are and gating resources so as to gather the information that feeds your sales system.
These materials would play a big part in lead generation:
* Ebooks that teach your customers how to get started
* Reports that uncover the trends and opportunities in the market
* Webinars that guide your customers through the difficult problems they encounter
* Landing pages that introduce specific solutions your company offers
* Newsletters that inform them about industry news and updates
# 10 Top lead generation software and tools to grow your business
B2B lead generation is difficult, yet having the right tools can make a big difference. Let’s take a look at some of the best tools out there that can be used to work your way through the entire lead generation process.
## #1 — SurveyMonkey
**Get to know your current customers**
SurveyMonkey is a questionnaire tool that helps you learn who your customers are and what they love and hate about your business or products.
[SurveyMonkey](https://apply.surveymonkey.com/referral-demo/?grsf=hg4r6n) provides a free version with limited services:
* 10 questions
* 100 respondents
* 15 question types
* Light theme customization and templates
## #2 — Ahrefs
**Learn from your competitor**
Ahrefs is a search engine optimization (SEO) tool that helps you analyze practically everything about your website and, at the same time, your competitors’. Using Ahrefs, you can find out what your competitors have been doing to get leads, such as the keywords they are targeting and more. Digital marketers can dig a lot out of the auto-generated reports and pick up good ideas for their next marketing campaigns.
[Ahrefs](https://ahrefs.com/pricing) doesn’t offer a free version but gives two extra months on annual billing. The paid plans (from $99.95 per month) include:
* Rank Tracker
* Site Audit
* Site Explorer
* Content Explorer
* Keywords Explorer
* Alerts
## #3 — SEMrush
**Optimize your site to get more organic traffic**
Compared with Ahrefs, SEMrush is an SEO tool that focuses more on auditing your own website. With it, you can quickly locate problems on your site, like broken links, incorrect tags, and so on. Google’s crawler rewards ongoing website optimization with higher rankings.
Same as Ahrefs, [SEMrush](https://www.semrush.com/lp/sem/en/?ref=8131476429&utm_campaign=aio_%20campaign&utm_source=berush&utm_medium=promo&utm_term=23) only offers paid versions (From $99.95 per month). It provides:
* Branded reports
* Historical Data
* Extended limits
* White-label reports
* API access
* Extended limits and sharing options
* Google Data Studio Integration
* And more
## #4 — Mailchimp
**Lead generation with email marketing**
[Based on 2018 data](https://www.smartinsights.com/email-marketing/email-communications-strategy/email-marketing-still-worth-taking-seriously-2018/), email marketing, the old-fashioned way, still works effectively today. Whatever the reasons, it is important to understand how an email marketing automation platform like Mailchimp brings in more business leads.
Digital marketers can use Mailchimp to cover the whole process of a marketing campaign by scheduling emails or setting up trigger emails or drip-feed emails.
For e-commerce businesses specifically, Mailchimp supports link tracking, offering a fairly complete report on the total orders and the average revenue brought in by an email campaign.
[Mailchimp](https://app.mobilemonkey.com/signup?ref=4GAWZw) offers a robust free version for customizing the layout of an e...
*“How to Conduct B2B Lead Generation | 10 Tips and Tools”, posted 3/7/2022 in r/Octoparse_ideas: https://www.reddit.com/r/Octoparse_ideas/comments/t8ljkf/how_to_conduct_b2b_lead_generation_10_tips_and/*
*“How to Scrape Amazon with Octoparse”, posted 8/9/2021 in u/Octoparseideas (body removed): https://www.reddit.com/r/u_Octoparseideas/comments/p0t01t/how_to_scrape_amazon_with_octoparse/*
*“How to scrape Facebook account with Octoparse”, posted 12/8/2021 in r/webscraping (video: https://youtu.be/dxKTTKlBTQo): https://www.reddit.com/r/webscraping/comments/rbfkyu/how_to_scrape_facebook_account_with_octoparse/*
*“Best LinkedIn Job Scraper to Extract Job Postings from LinkedIn”, posted 8/25/2022 in u/Octoparseideas:*
LinkedIn job postings are a treasure trove of information and are among the most effective ways to build your network and attract new connections. The big social media site has tens of millions of potential candidates, and it's also one of the best places to find open job positions. But finding these postings can really be time-consuming if you're doing it manually. This article shows you how to scrape LinkedIn job postings, including a list of all current job posts as well as a way to search for specific jobs.
[https://www.octoparse.com/blog/linkedin-job-scraper/?utm\_source=2022q3&utm\_medium=linkedin-job-scraper&utm\_campaign=reddit](https://www.octoparse.com/blog/linkedin-job-scraper/?utm_source=2022q3&utm_medium=linkedin-job-scraper&utm_campaign=reddit)
*Permalink: https://www.reddit.com/r/u_Octoparseideas/comments/wx8u5p/best_linkedin_job_scraper_to_extract_job_postings/*
*“Price Comparison Website: How To Build It And Source The Data”, posted 6/11/2021 in u/Octoparseideas (body removed): https://www.reddit.com/r/u_Octoparseideas/comments/nx994i/price_comparison_website_how_to_build_it_and/*
*“Can You Export Google Map Search Results to Excel”, posted 8/30/2022 in r/Octoparse_ideas:*
Google Maps is an excellent source for finding business leads and contacts. As data about businesses worldwide is available in Google Maps, it can be a go-to resource when you are researching local businesses and need their data.
In this blog post, we will discuss what data we can scrape from Google Maps and how to export Google Maps search results to an Excel or CSV file.
[https://www.octoparse.com/blog/export-google-maps-search-results-to-excel/?utm\_source=2022q3&utm\_medium=export-google-maps-search-results-to-excel&utm\_campaign=reddit](https://www.octoparse.com/blog/export-google-maps-search-results-to-excel/?utm_source=2022q3&utm_medium=export-google-maps-search-results-to-excel&utm_campaign=reddit)
*Permalink: https://www.reddit.com/r/Octoparse_ideas/comments/x19an6/can_you_export_google_map_search_results_to_excel/*
*“15 Highest Paying Programming Languages in 2022”, posted 5/17/2022 in r/Octoparse_ideas:*
Technology holds a dominant position in the economy and society, and enterprises are trying hard to find skilled programmers. For people who want a job in this industry, it is useful to have a full picture of which programming languages are most in demand on the market and which command the highest pay. Now, let’s take a look at the 15 highest-paying programming languages in 2022.
[https://www.octoparse.com/blog/15-highest-paying-programming-languages-in-2017/?utm\_source=sale2022&utm\_medium=15languages&utm\_campaign=reddit](https://www.octoparse.com/blog/15-highest-paying-programming-languages-in-2017/?utm_source=sale2022&utm_medium=15languages&utm_campaign=reddit)
*Permalink: https://www.reddit.com/r/Octoparse_ideas/comments/urcrqx/15_highest_paying_programming_languages_in_2022/*
*“15 Highest Paying Programming Languages in 2022”, posted 5/17/2022 in u/Octoparseideas (duplicate of the post above): https://www.reddit.com/r/u_Octoparseideas/comments/urcr7j/15_highest_paying_programming_languages_in_2022/*
*“URL Extractor: Get URLs from Hyperlinks in A Web Page”, posted 12/27/2021 in u/Octoparseideas:*
[https://www.octoparse.com/blog/url-extractor-get-urls-from-hyperlinks-in-a-web-page/?re=](https://www.octoparse.com/blog/url-extractor-get-urls-from-hyperlinks-in-a-web-page/?re=)
This is a quick guide to help you pull a list of URLs, or a list of data on a web page, into Excel using Octoparse. Is this the URL extractor you are looking for? Let’s see.
# URL Extractor / List Extractor
I am not sure if you know what a roundup article is, but you must have read one, and most likely you have read something you wanted to save for future use.
Take this article, [100 infographic submission sites](https://revuwire.com/submit-infographics-100-infographic-submission-sites/), as an example. If I am an SEO marketer and one day I come across this roundup post, what comes to mind is:
“Hey, look at this. I can pull these websites’ URLs down into a table, and every time I create a new infographic, I will submit it to these sites. This could definitely help boost my website traffic, or at least my number of backlinks.”
Yeah, this is what a URL extractor can do. I am going to do this with a web scraping tool, Octoparse, in a few seconds.
# Scrape URLs in a Web Page
This is a simple example of how you can scrape a list of URLs from a web page into excel. Octoparse can scrape all kinds of structured data from web pages efficiently.
If you are looking to scrape data other than URLs, more cases will be introduced in a video later. The video helps too if you find this textual tutorial boring.
## Prerequisites
* [Download Octoparse](https://www.octoparse.com/download) and install
* Sign up and log in
* A target URL ([example](https://revuwire.com/submit-infographics-100-infographic-submission-sites/)) to scrape a list of URLs from
When you enter the target URL into Octoparse, the web page is rendered in the built-in browser. You can browse it as if you were surfing in Chrome. The one difference is that you can click and build a scraper while you browse.
https://preview.redd.it/cqhzqmw022881.png?width=512&format=png&auto=webp&v=enabled&s=33a1129c5c79a8bd94d36961f4cfa2a59aa1b42c
## Step-by-step Guide
* Enter the target URL into Octoparse
* Click the first hyperlink in the list
* Click the second hyperlink in the list
(The whole list of infographic websites will be selected in green)
* Click “Extract both text and URL of the link”
(Now data can be previewed in the table)
* Click “Create Workflow”
* Click the blue-button “Run” above
https://preview.redd.it/6w1odlxg32881.png?width=512&format=png&auto=webp&v=enabled&s=529abaf0de269f518fa90ac6cf64437ca82d8dd8
That’s it. After a few clicks, you have built and run your URL extractor and gotten all 100 links into Excel for your use.
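For readers who prefer code, the same "extract both text and URL of the link" step can be sketched in plain Python with the standard library. The HTML snippet and site names below are invented stand-ins for the real roundup page:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects (anchor text, href) pairs for every <a> tag,
    mirroring the "Extract both text and URL of the link" action."""
    def __init__(self):
        super().__init__()
        self.links = []      # extracted (text, url) rows
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None

# Stand-in markup; the real page's structure will differ.
html = ('<ul><li><a href="https://visual.ly">Visual.ly</a></li>'
        '<li><a href="https://infographicjournal.com">Infographic Journal</a></li></ul>')
parser = LinkExtractor()
parser.feed(html)
print(parser.links)
```

From here, `parser.links` is exactly the two-column table (text, URL) that Octoparse previews before export.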
## Use Auto-detection
If the whole list on the web page is not selected automatically by Octoparse after you click a few pieces of data, you may need another method.
You can try Octoparse’s auto-detection feature and let the AI algorithm select the data for you. If that does not work either, the website you are scraping is unusual: it has a structure the bot cannot recognize.
https://preview.redd.it/scs9wiyj32881.png?width=512&format=png&auto=webp&v=enabled&s=7ec313b042273b8793d89cde3df8ca5482671278
In this case, you need to amend the Xpath to locate the data accurately. Curious about [how to write an Xpath](https://www.octoparse.com/blog/what-is-xpath-and-how-to-use-it-in-octoparse)? Then you are getting on board with web scraping.
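As a rough illustration of what a hand-written Xpath buys you, here is a sketch using Python's `xml.etree.ElementTree` (which supports only a limited XPath subset). The markup, class names, and URLs are invented: the point is that an attribute predicate pins down exactly the elements you want when auto-detection grabs too much:

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment: only the class="item" links belong
# to the list we actually want to extract.
page = (
    "<div>"
    "<a class='nav' href='/home'>Home</a>"
    "<a class='item' href='https://site-a.example'>Site A</a>"
    "<a class='item' href='https://site-b.example'>Site B</a>"
    "</div>"
)
root = ET.fromstring(page)
# The predicate [@class='item'] narrows the match, just as an
# amended XPath does in Octoparse.
urls = [a.get("href") for a in root.findall(".//a[@class='item']")]
print(urls)
```

The navigation link is skipped because it fails the predicate; the two list links are returned in document order.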
Hey, don’t worry. Just assume your website is well-structured and test it with auto-detection.
You may get more than you expect. The AI algorithm is not omnipotent, but it is powerful enough to cover most types of web pages.
In this video, you will also see how powerful auto-detection is and how it helps scrape travel data from Lonely planet effortlessly.
# Octoparse: Boost Your Working Efficiency
If you are a digital marketer with no idea about web scraping, this is a good chance to learn something new. I am a marketer, and since I got hold of this web scraping tool, I have collected data at a rate I could never match manually.
That means:
* You can grab articles and news for your content creation ([http://www.dataextraction.io/?p=1167](http://www.dataextraction.io/?p=1167)).
* You can bulk download data from your competitors and always keep yourself informed.
* You can pull valuable resources down to Excel and turn them into an actionable working plan ([https://medium.com/dataseries/a-long-story-how-i-found-youtube-kols-for-marketing-purposes-5e8ef88bbb71](https://medium.com/dataseries/a-long-story-how-i-found-youtube-kols-for-marketing-purposes-5e8ef88bbb71)).
https://preview.redd.it/junju8en32881.png?width=512&format=png&auto=webp&v=enabled&s=b4c0f043e8164f491fbf0ead1efd82a2c7d32276
And a no-code web scraping tool is extremely friendly to a marketer, or anyone without coding knowledge who needs data.
[Give it a try.](https://www.octoparse.com/download)
*Permalink: https://www.reddit.com/r/u_Octoparseideas/comments/rpjz0k/url_extractor_get_urls_from_hyperlinks_in_a_web/*
*“Participate in the ‘Octoparsing with Zapier’ event to win FREE gifts”, posted 10/8/2021 in u/Octoparseideas:*
Participate in the event on Twitter and win FREE gifts: [https://hubs.la/H0Z13tf0](https://hubs.la/H0Z13tf0)
https://preview.redd.it/iu4t1j4eg6s71.png?width=1600&format=png&auto=webp&v=enabled&s=86c15488f53b75c61bab5b2ebffe76aba067be5b
🌟Octoparsing with Zapier🌟
3 steps to win gifts worth $270:
1. Connect Octoparse with Zapier & Export your cloud data to any app.
2. Take a screenshot showing the successful data export.
3. Share your screenshot & feedback quoting the event tweet.
Join us and play!
*Permalink: https://www.reddit.com/r/u_Octoparseideas/comments/q3spil/participate_in_the_octoparsing_with_zapier_event/*
*Comment on the same post, 10/8/2021:*
- Gold: Amazon Gift Card $20 + Custom Crawler Coupon $250
- Silver: Amazon Gift Card $10
- Bronze: Amazon Gift Card $5
Find the tutorial here: https://helpcenter.octoparse.com/hc/en-us/articles/4406338353689-How-to-Connect-Octoparse-with-Zapier
*Permalink: https://www.reddit.com/r/u_Octoparseideas/comments/q3spil/participate_in_the_octoparsing_with_zapier_event/hftujgy/*
*“How To Scrape Twitter Followers and Export in Excel”, posted 8/23/2022 in r/Octoparse_ideas:*
Nearly all social media influencers are on Twitter, where you can also find their huge follower lists, a mountain of data you can use for analytics, promotion, and insight. Unlike other social media sites, Twitter actively allows people to use its public data and do good with it.
This article will therefore walk you through how to scrape Twitter followers’ information via Python Twint and via no-coding methods.
[https://www.octoparse.com/blog/web-scraping-job-postings/?utm\_source=2022q3&utm\_medium=how-to-scrape-twitter-followers&utm\_campaign=reddit](https://www.octoparse.com/blog/web-scraping-job-postings/?utm_source=2022q3&utm_medium=how-to-scrape-twitter-followers&utm_campaign=reddit)
*Permalink: https://www.reddit.com/r/Octoparse_ideas/comments/wvj7lu/how_to_scrape_twitter_followers_and_export_in/*
*“Automate Job Feed Scraping & Posting To Scale-Up Your Business”, posted 9/30/2021 in r/Octoparse_ideas:*
Nowadays companies use an ATS (Applicant Tracking System) to post jobs and find the perfect candidates. But in a competitive industry like healthcare, candidates won’t necessarily come to your company’s career site and search for jobs. HR personnel then copy their job lists and post them on different career websites. However, there are a few problems with this manual approach. First, it requires a lot of manual work. Copying and pasting across different career websites often introduces clerical errors; as a result, candidates get confused and give a listing a pass. Second, your posts on different sites won’t stay updated: once you change small details in the job description, you have to go back and manually amend each post.
As a result, it is not hard to imagine that even if you have a good company, you don’t have the best fishing net to capture all the best candidates. However, there is an excellent tool that saves HR a ton of time while sourcing as many good candidates as possible from different career sites. A scraping tool like Octoparse is the answer to creating an automated job board that integrates with a company’s ATS and pushes the latest listings to different career sites.
Breathing is automated. It makes our life so easy. We realize its importance, even more, when we see people going on ventilators. But are your HR operations and processes automated? Is your job creation, listing, interviewing, and hiring process fully optimized?
We ask these questions because we see a lot of companies struggling unnecessarily. Several enterprises are still manually posting their vacancies to different job boards. We at Octoparse believe that is quite outdated in the age of AI & data science.
“The world is changing whether you like it or not. Get involved or get left behind.” — Dave Waters
And so, in this article, we emphasize the importance of automating some of the cumbersome HR processes and operations. We will demonstrate how to use web crawling to scrape job listings from career sites, then automatically post the scraped job data to the job boards of your choice using XML job feeds.
We provide a seamless job wrapping solution stack consisting of an Applicant Tracking System (ATS), Octoparse scraping services, and XML feeds, which can be easily integrated into your existing HR tech stack.
https://preview.redd.it/fhl687y3dlq71.png?width=512&format=png&auto=webp&v=enabled&s=066d058077526d3724f0649ecaeec3c2bd32be0a
Not to mention, this relieves the HR team of manual work and lets them focus on more critical aspects of the hiring cycle, such as crafting a perfect job description and training and onboarding new talent.
[Keep reading: What is an Application Tracking System (ATS)?](http://www.dataextraction.io/?p=1135/?re=)
*Permalink: https://www.reddit.com/r/Octoparse_ideas/comments/pyepi7/automate_job_feed_scraping_posting_to_scaleup/*
*“How to Extract Data from PDF to Excel Without Coding Skills”, posted 5/30/2022 in u/Octoparseideas:*
Nowadays people use PDFs on a large scale for reading, presenting, and many other purposes, and many websites store data in PDF files for viewers to download instead of posting it on web pages, which brings challenges to web scraping. You can view, save, and print PDF files with ease. The problem is that PDF is designed to preserve the integrity of the file: it is more like an "electronic paper" format that makes sure contents look the same on any computer at any time, so it is difficult to edit a PDF file and export data from it.
Fortunately, there are some solutions that help extract data from PDF into Excel.
[https://www.octoparse.com/blog/how-to-extract-pdf-into-excel/?utm\_source=sale2022&utm\_medium=extractdatafrompdf&utm\_campaign=reddit](https://www.octoparse.com/blog/how-to-extract-pdf-into-excel/?utm_source=sale2022&utm_medium=extractdatafrompdf&utm_campaign=reddit)
*Permalink: https://www.reddit.com/r/u_Octoparseideas/comments/v0w2iw/how_to_extract_data_from_pdf_to_excel_without/*
*“February 2021 Has Arrived!”, posted 2/1/2021 in u/Octoparseideas (body removed): https://www.reddit.com/r/u_Octoparseideas/comments/l9wpml/february_2021_has_arrived/*
*“How to scrape LinkedIn jobs with Octoparse”, posted 12/16/2021 in r/Octoparse_ideas (video: https://youtu.be/tQEy8aqyfUE): https://www.reddit.com/r/Octoparse_ideas/comments/rhghwp/how_to_scrape_linkedin_jobs_with_octoparse/*
*“How to scrape Yelp business data”, posted 5/27/2022 in u/Octoparseideas:*
Hi friends! When it comes to Yelp scraping, most people want to gather local business data such as the business name, contact number, website, business hours, and so on. Today, we are going to show you how to scrape Yelp business data using Octoparse in a few easy steps.
[https://youtu.be/9UBhUQhJTGE](https://youtu.be/9UBhUQhJTGE)
*Permalink: https://www.reddit.com/r/u_Octoparseideas/comments/uyuiq6/how_to_scrape_yelp_business_data/*
*“What’s New about Octoparse 8.4.2?”, posted 9/9/2021 in r/Octoparse_ideas:*
[https://www.octoparse.com/blog/whats-new-about-octoparse-842/?re=](https://www.octoparse.com/blog/whats-new-about-octoparse-842/?re=)
Octoparse users, how’s your web scraping journey with the software going? In late September, version 8.4.2 of the product will be released. Want to know what’s new in the coming version? Keep reading!
# 1. Zapier integration
In version 8.4.2, you can auto-export your cloud data with [Zapier](https://zapier.com/) to Google Drive, Google Sheet, and more software.
https://preview.redd.it/nou4jikr0em71.png?width=600&format=png&auto=webp&v=enabled&s=1ff3666a789b2ec01e6b1de676684464644a3c84
[Find more information here and have a try.](https://zapier.com/apps/google-drive/integrations/octoparse)
# 2. Scrape while scrolling within a certain section
Take Google Maps as an example. In version 8.4.2, you can enter the webpage and scrape only the search results using this feature, which is implemented by setting up the [Xpath](https://www.octoparse.com/blog/what-is-xpath-and-how-to-use-it-in-octoparse).
https://preview.redd.it/k1r033dt0em71.png?width=599&format=png&auto=webp&v=enabled&s=157c762a16a8c125110f1f0d630077e89335db19
# 3. Customize the user agent
You can change the user agent string and the user agent name on browsers when using version 8.4.2 to scrape data.
To understand how user agents work, this article can be helpful: [How to Change User Agents in Chrome, Edge, Safari & Firefox](https://www.searchenginejournal.com/change-user-agent/368448/)
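At the HTTP level, "customizing the user agent" just means setting the `User-Agent` request header before the page is fetched. A minimal sketch with Python's standard library (the UA string and URL are example values, not Octoparse's actual defaults):

```python
import urllib.request

# Example user-agent string; swap in whatever identity the scrape needs.
ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"

# Build the request with the custom header; nothing is fetched yet,
# so this runs without any network access.
req = urllib.request.Request("https://example.com",
                             headers={"User-Agent": ua})
print(req.get_header("User-agent"))
```

When the request is eventually opened, the server sees this string instead of the default `Python-urllib/x.y` identifier.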
# 4. Backup local data to the Cloud
This feature used to be available for enterprise users only. In the new 8.4.2 version, it is open to users with professional plans as well.
# 5. Formatting the timestamp
This feature is mainly designed for scraping social media platforms. [Converting the timestamp of the posts to date](https://timestamp.online/) is available in version 8.4.2.
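The conversion itself is a one-liner: social platforms often expose a raw Unix timestamp (seconds since the epoch), and formatting turns it into a readable date. A sketch with Python's `datetime` (the timestamp below is just an example value):

```python
from datetime import datetime, timezone

ts = 1631154035  # example Unix timestamp, seconds since the epoch
# Interpret the timestamp in UTC and render it as a date string.
posted = datetime.fromtimestamp(ts, tz=timezone.utc)
print(posted.strftime("%Y-%m-%d %H:%M:%S"))  # → 2021-09-09 02:20:35
```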
# 6. Other updates in existing features and the UI
With the updates, version 8.4.2 will be more stable and convenient to use compared to former versions.
Don’t hesitate to contact us at [support@octoparse.com](mailto:support@octoparse.com) or [submit a ticket](https://helpcenter.octoparse.com/hc/en-us/requests/new) here if you have any questions. The customer service team will be ready to help you as always. Wish you an even happier scraping then!
*Permalink: https://www.reddit.com/r/Octoparse_ideas/comments/pkonh4/whats_new_about_octoparse_842/*
*“How Can Real Estate Scraping Improve Your Business”, posted 9/6/2022 in u/Octoparseideas:*
Real estate scraping is now a thriving way to analyze potential buyers, consumer needs, optimal prices, and a bulk of other online information. Do you need to retrieve such info to improve your real estate business? Then this article will surely come in handy.
[https://www.octoparse.com/blog/real-estate-scraping/?utm\_source=2022q3&utm\_medium=real-estate-scraping&utm\_campaign=reddit](https://www.octoparse.com/blog/real-estate-scraping/?utm_source=2022q3&utm_medium=real-estate-scraping&utm_campaign=reddit)
*Permalink: https://www.reddit.com/r/u_Octoparseideas/comments/x707cr/how_can_real_estate_scraping_improve_your_business/*
*Posted 1/26/2022:*
*Originally published as* [https://www.octoparse.com/blog/what-is-web-scraping-basics-and-use-cases/?re=](https://www.octoparse.com/blog/what-is-web-scraping-basics-and-use-cases/?re=) *on January 24, 2022.*
A basic intro to lead you in the world of web scraping. What is web scraping? How does it work, how is it used? What are the pros and cons? All questions that concern you will be answered here.
# What is web scraping?
Web scraping is a way to download data from web pages.
You may have heard some of its nicknames, like data scraping, data extraction, or web crawling (web crawling can be narrower, referring to data scraping done by search engine bots). In most cases, they mean the same thing: a programmatic way to pull data from the web.
Web scraping helps fetch data (like emails, phone numbers, articles, etc.) from web pages and organize it into certain formats like Excel, CSV or HTML, etc.
See how [Wikipedia explains web scraping](https://en.wikipedia.org/wiki/Web_scraping):
>*“The content of a page may be parsed, searched, reformatted, its data copied into a spreadsheet or loaded into a database. Web scrapers typically take something out of a page, to make use of it for another purpose somewhere else. An example would be to find and copy names and telephone numbers, or companies and their URLs, or e-mail addresses to a list (contact scraping).”*
In essence, web scraping is a dedicated data collector who captures the exact set of data you want from a load of web pages and makes it into a neat file for your download and further use.
# What’s the point of web scraping?
Big Data and Automation are no longer new concepts in the current business world. They are widely used techniques to improve people’s effectiveness and efficiency.
Big data is big in volume. Automation is about getting things done on autopilot. And web scraping is good at both: getting voluminous data fast with little human labor required.
In the context of big data collection, web scraping comes to the rescue. If you want to train a machine learning model, a great amount of accurate input data will make you smile. This data will teach your model important lessons and get you a more intelligent algorithm.
That’s when web scraping plays its ace: grabbing data for you efficiently from a number of websites and getting it into a machine-readable format for quick use.
Well, not everyone has an AI model to train, but most of us need to collect data for different purposes. Web scraping’s automated nature greatly improves people’s working efficiency and eliminates human error. Lay back and let the robot do what is repetitive.
When you get to the [use cases#](https://www.octoparse.com/blog/what-is-web-scraping-basics-and-use-cases#web_scraping2), you will find out how web scraping helps in real cases.
## How does web scraping work?
A web page's data is written in its HTML file. Browsers like Chrome and Firefox are tools that read the HTML file and render it for us.
Therefore, no matter how diverse web pages look, every string of data we see in the browser is already written in the HTML source code. Whatever you see can be traced and located in the code (by XPath, a language used to locate elements).
Web scraping finds the right data according to where it is located and performs a series of actions (such as extracting selected text, extracting hyperlinks, entering preset data, and clicking certain buttons) just like a human would, except that it surfs the Internet around the clock, copies data fast, and never tires.
Once the data is ready, you can download it from the cloud or to a local file for further use.
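The locate-and-extract cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not Octoparse's implementation: the HTML snippet, class names, and fields are invented, and a real scraper would first download the page's HTML before locating elements in it.

```python
import csv
import io
import xml.etree.ElementTree as ET

# A stand-in for the HTML a scraper would download first.
# The structure and class names are invented for illustration,
# and the snippet is well-formed so the stdlib parser accepts it.
PAGE = """<html><body>
  <div class="contact"><span class="name">Acme Inc.</span>
    <a href="https://acme.example">site</a></div>
  <div class="contact"><span class="name">Globex</span>
    <a href="https://globex.example">site</a></div>
</body></html>"""

tree = ET.fromstring(PAGE)

# Locate each record with an XPath-style expression, then pull out
# the exact fields we want, just as described above.
rows = [
    (div.find("span[@class='name']").text, div.find("a").get("href"))
    for div in tree.findall(".//div[@class='contact']")
]

# Organize the result into a neat CSV for download and further use.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["name", "url"])
writer.writerows(rows)
print(buf.getvalue())
```

A tool like Octoparse wraps this same cycle (locate, act, extract, export) behind a point-and-click workflow instead of code.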
# How is web scraping used?
Who is doing web scraping and how do they get empowered by web scraping? Here are some use cases. You may discover how web scraping could benefit you as well.
## Who is using web scraping?
Web scraping is widely used in industries like:
* jobs & recruitment
* consultancy
* hotel & travel
* eCommerce & retailing
* finance and more
* marketing
***Tips:*** *Check how to* [*get started#*](https://www.octoparse.com/blog/what-is-web-scraping-basics-and-use-cases#web_scraping4)*, an example of how I built my first web scraper and got data from YouTube that helped my KOL marketing.*
They are getting data mostly for price/brand monitoring, price comparison, and big data analysis that serve their decision-making process and business strategy.
For individuals, web scraping helps professionals like:
* data scientists
* data journalists
* marketers
* academic researchers
* business analysts
* eCommerce sellers and more
to obtain data that supports their sales, marketing, research, and analysis.
Does web scraping sound like a big undertaking to you? Believe me, it is not. It can be used in many trivial ways to get you out of tedious, repetitive work. Basically, if you need data that can be found on websites and you don't want to do mind-numbing copying and pasting manually, web scraping is for you.
**Read also:**
* [How Dealogic Gets Empowered with Content Aggregation](https://service.octoparse.com/dealogic-web-scraping-for-content-aggregation)
* [Ecommerce Product Tracking for Successful Reselling](https://service.octoparse.com/amazon-product-monitoring)
* [Web Scraping In Marketing Consultancy](https://service.octoparse.com/pricetrack-consultancy-web-scraping)
* [Web Scraping Manages Inventory Tracking in Retail Industry](https://service.octoparse.com/inventory-web-scraping-blind-rivet-supply)
## What are the most scraped data/websites?
According to [the Most Scraped Websites](https://www.octoparse.com/blog/top-10-most-scraped-websites) by Octoparse, eCommerce marketplaces, directory websites and social media platforms are the most scraped websites in general.
**Websites like Amazon, eBay, Walmart, Yelp, Yellowpages, and Craigslist, and social media platforms like Facebook, Twitter, and LinkedIn, are among the most popular.**
What data are people getting from these sites? Well, everything that serves their research or sales.
* Online product details like stock, prices, reviews, and specifications;
* Business/lead information like stores' or individuals' names, emails, addresses, phone numbers, and other details that support outreach;
* Discussions on social media or comments on review pages that serve as data sources for NLP or sentiment analysis.
The need to migrate data is also one of the reasons people choose web scraping. A scraper then works like a grand Ctrl+C, copying data from one place to another for the user.
You may be interested in [web scraping business ideas](https://www.octoparse.com/blog/10-web-scraping-business-ideas-for-everyone) to discover more detailed information about how web scraping is used in practical scenarios.
# The Pros and Cons of Web Scraping
Because of its accuracy and efficiency, web scraping empowers individuals and businesses in many ways. However, worries always exist: will it be too complicated to handle? Is it hard to fix and maintain? Fair questions. Yet if you get the opportunity to dive into it, you will very likely find that the advantages of web scraping outweigh the tricky parts.
## The advantages of web scraping
**#High speed**
Getting data faster: this is self-evident and may be the core reason people resort to web scraping. Compared to doing the work manually, a web scraper executes your commands automatically, following the workflow you have built for it. Each step that would have taken up your time is done by the scraper.
Once you set it up, it will run for you relentlessly, getting all kinds of web data fast from different websites. If you want to see how fast a scraper can be, I recommend you try our [scraper templates](https://helpcenter.octoparse.com/hc/en-us/articles/900003158843). You may try an Amazon scraper to gather product details or product reviews and see how a scraper gets you hundreds of well-structured data lines in just a minute.
[Download Octoparse](https://www.octoparse.com/download) to witness the speed of web scraping.
BTW, web scraping is a valua...
*Posted to Reddit as* [*What Is Web Scraping — Basics & Practical Uses*](https://www.reddit.com/r/u_Octoparseideas/comments/sd2ah2/what_is_web_scraping_basics_practical_uses/) *on 1/26/2022.*
*Originally published as* [*https://www.octoparse.com/blog/b2b-lead-generation-top-10-tools-for-digital-marketing/?re=*](https://www.octoparse.com/blog/b2b-lead-generation-top-10-tools-for-digital-marketing/?re=) *on March 7, 2022.*
B2B lead generation is important for growing your business, as it determines whether your salespeople will close more deals.
In this article, you will learn about what B2B lead generation is, and how to conduct it effectively using some of the best tools in the market.
https://preview.redd.it/15igho89jxl81.png?width=1350&format=png&auto=webp&v=enabled&s=669f06e0fbae0aee284ac80f4ef33bfa8f684f00
# What is B2B lead generation?
B2B [Lead generation](https://en.wikipedia.org/wiki/Lead_generation) is known as the process of identifying and attracting potential customers.
These people may be some random visitors who just happened to bump into your website (thanks to your SEO team) or people that are actually looking for certain help.
What they have in common is that they fit your customer persona: they are very likely looking for a product like yours to get them out of a dilemma! These people should be the target of your marketing and sales teams and should definitely be absorbed into your lead generation system (or marketing-sales system, whatever you call it).
Lead generation opens the door to more sales. Yet, in most cases, your lead is not the one who walks straight over to the cashier and pays right away. That’s where cultivation comes in — to lead your potential customer from the stage of awareness to purchase. So, educate them, tell them what it is and how it works.
In short, to turn leads into sales, you’ll need to:
* Attract more visitors
* Identify potential buyers
* Cultivate potential customers
* Initiate the sale and close deals (hurray!)
# What are B2B sales leads?
If lead generation is about converting a stranger who has never heard of your company into a customer who pays your bills, then when we run into one of these strangers, we'd better be well aware of where they are in the sales funnel.
People in different stages require customized attention. Handing over every potential lead to your sales team is not going to be an effective way to achieve more sales. There are good reasons why leads are generally categorized into two types: MQL (marketing qualified leads) and SQL (sales qualified leads).
[Source: https://www.digitalbrew.com/6-sales-funnel-stages-video-amplifies/](https://preview.redd.it/0dnjjvjbjxl81.png?width=1342&format=png&auto=webp&v=enabled&s=12d47010779a069ff3e994ffe15bcc5745875b4d)
## What is MQL?
A marketing qualified lead lingers at the stage of awareness or interest. This is a group of people who are relevant to you but not yet ready to purchase.
People with these behaviors may be recognized as an MQL:
* Visit your website regularly
* Download your ebook or white paper
* Sign up for your next webinar
* Send emails or start an online chat
* Subscribe to your newsletter
That’s how people show their interests.
The marketing team should help dig out their needs and escort them through the evaluation stage with a clear demo (or other resources) that addresses their concerns.
## What is SQL?
A sales qualified lead is a consumer who is ready to buy. They know what they need, and they are studying specifications and making comparisons until they can pick the best candidate, take out their wallet, and say: “wrap it up”.
A sales qualified lead may:
* Ask for a price list with a full plan breakdown
* Send an email with concrete questions about what your services can and cannot do
* Schedule a phone call for a full product demo
* Submit a form describing the challenges they are facing so the sales team can connect
* Enter their credit card info and sign up for a trial
These are the prospects the salespeople would love to talk to (and the ones they should talk to). Time to take quick action! Dive deep into what they are looking for and draw up a custom solution that fits their needs, or go further and run a POC (proof of concept) to prove the capability of what you are offering.
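The behavior lists above lend themselves to a simple rule-based qualification step before any human review. A minimal sketch, assuming hypothetical behavior flags (the names, signals, and routing below are illustrative, not a standard scoring scheme):

```python
# Hypothetical behavior flags per lead; the flag names and the
# "any SQL signal wins" rule are invented for illustration.
MQL_SIGNALS = {"visits_regularly", "downloaded_ebook", "signed_up_webinar",
               "emailed_us", "subscribed_newsletter"}
SQL_SIGNALS = {"asked_for_pricing", "asked_concrete_questions",
               "scheduled_demo_call", "submitted_challenge_form",
               "started_trial"}

def qualify(behaviors: set) -> str:
    """Return 'SQL', 'MQL', or 'unqualified' for a set of observed behaviors."""
    if behaviors & SQL_SIGNALS:
        return "SQL"          # ready to buy: route to sales
    if behaviors & MQL_SIGNALS:
        return "MQL"          # interested: keep nurturing via marketing
    return "unqualified"

print(qualify({"downloaded_ebook", "subscribed_newsletter"}))  # MQL
print(qualify({"asked_for_pricing"}))                          # SQL
```

In practice, teams usually refine this into weighted scoring, but the MQL/SQL split above is the core idea.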
# How to conduct lead generation that leads to sales?
This is teamwork — a marketing team to attract leads and cultivate and a sales team to select leads, communicate and make deals.
If we focus on lead generation only, the concept in many ways overlaps with what we call “inbound marketing”: attracting MQLs and SQLs into your system by placing valuable content where they are and setting up gated resources to gather the information that feeds your sales system.
These materials would play a big part in lead generation:
* Ebooks that teach your customers how to get started
* Reports that uncover the trends and opportunities in the market
* Webinars that guide your customers through the difficult problems they encounter
* Landing pages that introduce specific solutions that your company offers
* Newsletters that inform them about industry news and updates
# 10 Top lead generation software and tools to grow your business
B2B lead generation is difficult; having the right tools can make a big difference. Let's take a look at some of the best tools out there that can be used to work your way through the entire lead generation process.
## #1 — SurveyMonkey
**Get to know your current customers**
SurveyMonkey is a questionnaire tool that can help you learn who your customers are and what they love and hate about your business or products.
[SurveyMonkey](https://apply.surveymonkey.com/referral-demo/?grsf=hg4r6n) provides a free version with limited services:
* 10 questions
* 100 respondents
* 15 question types
* Light theme customization and templates
## #2 — Ahrefs
**Learn from your competitor**
Ahrefs is a search engine optimization (SEO) tool that helps you analyze practically everything about your websites and, at the same time, your competitors'. Using Ahrefs, you can find out what your competitors have been doing to get leads, such as the keywords they are targeting and more. There's a lot for digital marketers to dig out of the auto-generated reports, along with good ideas for the next marketing campaigns.
[Ahrefs](https://ahrefs.com/pricing) doesn’t offer a free version but gives an extra 2-month period for annual billing. The functions of the paid version (From $99.95 per month) are:
* Rank Tracker
* Site Audit
* Site Explorer
* Content Explorer
* Keywords Explorer
* Alerts
## #3 — SEMrush
**Optimize your site to get more organic traffic**
Compared with Ahrefs, SEMrush is an SEO tool that focuses more on auditing your own website. With it, we can quickly locate site problems like broken links, incorrect tags, and so on. Google's crawler will be happy to rank you higher for the ongoing website optimization.
Same as Ahrefs, [SEMrush](https://www.semrush.com/lp/sem/en/?ref=8131476429&utm_campaign=aio_%20campaign&utm_source=berush&utm_medium=promo&utm_term=23) only offers paid versions (From $99.95 per month). It provides:
* Branded reports
* Historical Data
* Extended limits
* White-label reports
* API access
* Extended limits and sharing options
* Google Data Studio Integration
* …..
## #4 — Mailchimp
**Leads generation with email marketing**
[Based on 2018 data](https://www.smartinsights.com/email-marketing/email-communications-strategy/email-marketing-still-worth-taking-seriously-2018/), email marketing, the old-fashioned way, still works effectively today. Whatever the reasons, it's important to figure out how an email marketing automation platform like Mailchimp brings in more business leads.
Digital marketers can use Mailchimp to cover the whole process of a marketing campaign by scheduling emails or setting up trigger or drip-feed emails.
Specifically, for eCommerce businesses, Mailchimp offers link tracking, with a relatively complete report on the total orders and average revenue brought in by each email campaign.
[Mailchimp](https://app.mobilemonkey.com/signup?ref=4GAWZw) offers a robust free version for customizing the layout of an e...
*Posted to Reddit as* [*How to Conduct B2B Lead Generation | 10 Tips and Tools*](https://www.reddit.com/r/u_Octoparseideas/comments/t8leck/how_to_conduct_b2b_lead_generation_10_tips_and/) *on 3/7/2022.*
[https://www.octoparse.com/blog/what-do-you-know-about-a-screen-scraper/?re=](https://www.octoparse.com/blog/what-do-you-know-about-a-screen-scraper/?re=)
Screen scraping is a data collection technique usually used to copy information shown on a digital display so it can be used for another purpose. In this article, we will introduce the process of screen scraping and how a screen scraper works.
https://preview.redd.it/0wb33hj0mhq81.jpg?width=1080&format=pjpg&auto=webp&v=enabled&s=65b9eb4a0fca91bf189f6e5c9dfc38af219ea045
*Posted to Reddit as* [*What Is Screen Scraping and How Does It Work?*](https://www.reddit.com/r/Octoparse_ideas/comments/ts2ar6/what_is_screen_scraping_and_how_does_it_work/) *on 3/30/2022.*
*Originally published as* [*https://www.octoparse.com/blog/data-collection-from-websites/?re=*](https://www.octoparse.com/blog/data-collection-from-websites/?re=) *on March 1, 2022.*
How do you collect data from websites? With web scraping, automation, and RPA technology, data collection can go way deeper than just gathering copies of data. As the old saying goes, a good start is half the success. In this article, we'll focus on the data collection part: specifically, why people collect web data and how to get it done effectively.
https://preview.redd.it/e2dk0742pvk81.png?width=1080&format=png&auto=webp&v=enabled&s=094b0d803e1b4f64df5641df18e00ce623729884
# What’s Data Collection
Data collection is the process of collecting information from one or more sources in a systematic way. In fact, this is still a vague definition and data collection practices can vary a lot in different circumstances.
Regardless of how different they are, as long as the project is well defined, some things are in common:
* The collecting process is usually systematic in one way or another, and tools are often used to carry it out.
* The collected data has to be transformed into the format of the platform where it will be processed.
Here is a definition by Wikipedia (more in a research context):
>*Data collection is the process of gathering and measuring information on targeted variables in an established systematic fashion, which then enables one to answer relevant questions and evaluate outcomes.*
# What’s the Goal of Data Collection?
* Through data collection, we can capture high-quality evidence for building convincing and credible answers to the questions that have been raised. (Academic research is a typical example.)
* Businesses may want to use the collected web data to build profitable services or to get a panoramic view of the market.
* Companies may need to collect data for data migration purposes
* See [What People Scrape When They Scrape the Web](https://www.octoparse.com/blog/what-is-web-scraping-basics-and-use-cases#web_scraping2) for a more comprehensive view of what people do with scraped data
Many companies need to extract data from websites to meet their various needs. But while collecting data from websites, they may run into problems like collecting irrelevant or duplicate data, having insufficient time or budget, lacking [useful tools](https://www.octoparse.com/blog/top-20-web-crawling-tools-for-extracting-web-data), or failing to extract dynamic data.
Well, problems exist, and so do solutions. Before getting frustrated, the first thing we can do is [make a data collection plan](https://www.dimagi.com/data-collection/):
1. Define your project goal
2. Clarify your data requirement
3. Decide the data collection approach
4. Carry out the process
# Data Collection Approaches
When collecting data from the web, you’ll need at least two things handy: a useful data collection tool and a list of data sources.
## Data sources: websites for data collection
Some websites offer rich statistics data for visitors to download and they could be valuable data sources for researchers. For your reference, here is a list of [70 open data sources](https://www.octoparse.com/blog/big-data-70-amazing-free-data-sources-you-should-know-for-2021). These are websites owned by governments, organizations, and business service providers, ranging across various industries such as health, finance, crime, etc. Hopefully, you’ll find something you need.
## Web scraping tools to collect data from websites
Tools can work wonders if you know how to use them effectively. A no-code data collection tool can get you exactly what you want in a short time, while gathering the same information by copying and pasting could take ages.
With the help of [data collection and analytics tools](https://www.octoparse.com/blog/top-30-big-data-tools-for-data-analysis), organizations are also able to collect data from mobile devices, website traffic, server activity, and other relevant sources, depending on the project.
[Web scraping](https://www.octoparse.com/blog/what-is-web-scraping-basics-and-use-cases) is a powerful technique to download data from websites — all kinds of data including:
✅ [Text and articles](https://www.octoparse.com/blog/build-your-blog-fast-with-web-scraping)
✅ Numerical data
✅ [Tables](https://www.octoparse.com/blog/scrape-data-from-a-table)
✅ Listings
✅ [Images](https://www.octoparse.com/blog/how-to-get-images-from-any-website)
*Tips: Octoparse is a web scraping tool designed to gather website data without coding. Instead of learning Python from scratch, you can leverage a no-code tool for an easy start. If you have any specific data requirements, feel free to contact us at support@octoparse.com.*
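For the “tables” item in the checklist above, here is a minimal sketch of flattening an HTML table into rows using only Python's standard library; the table content is invented for illustration:

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collects the cell text of every <tr> into a list of rows."""
    def __init__(self):
        super().__init__()
        self.rows = []        # completed rows
        self._row = None      # cells of the row being built
        self._in_cell = False # whether we are inside <td>/<th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row.append(data.strip())

p = TableParser()
p.feed("<table><tr><th>Product</th><th>Price</th></tr>"
       "<tr><td>Widget</td><td>$9.99</td></tr></table>")
print(p.rows)  # [['Product', 'Price'], ['Widget', '$9.99']]
```

Once the table is a list of rows like this, writing it out to CSV or Excel is straightforward.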
# Big Data and Data Collection
Big data aims to help people gain insights through data analysis and make data-driven decisions. There’s no doubt that data collection builds the foundation for big data applications. Together with new technologies such as machine learning and artificial intelligence that use complex algorithms to look for repeatable patterns among the collected data, we are getting closer to the time when data can truly “speak” for itself.
**About Web Scraping:**
[What’s Web Scraping](https://www.octoparse.com/blog/what-is-web-scraping)
[How Does Web Scraping Work](https://www.octoparse.com/blog/web-scraping-introduction)
[Web Scraping Using Google Sheets](https://www.octoparse.com/blog/simple-web-scraping-using-google-sheets)
[Scraping Data from Websites Using Excel](https://www.octoparse.com/blog/scraping-data-from-website-to-excel)
[Best Web Scraper for Mac](https://www.octoparse.com/blog/best-web-scraper-for-mac)
*Posted to Reddit as* [*Collecting Data from Websites without Coding Skills*](https://www.reddit.com/r/u_Octoparseideas/comments/t4owvw/collecting_data_from_websites_without_coding/) *on 3/2/2022.*
TAKE 30% OFF when you renew or upgrade
【Standard Year】SAVE $201!
【Professional Year】SAVE $500!
👉 Get free crawlers & 1-on-1 training: [https://www.octoparse.com/summer-sale-2022/?utm\_source=redditmiddle&utm\_medium=ing0617&utm\_campaign=22summersale](https://www.octoparse.com/summer-sale-2022/?utm_source=redditmiddle&utm_medium=ing0617&utm_campaign=22summersale)
https://preview.redd.it/0vt31cq1d0691.png?width=800&format=png&auto=webp&v=enabled&s=a4077e73d862ec05351a11e7aade773500bbf10f
*Posted to Reddit as* [*👏 Summer Sale 2022 Still Live*](https://www.reddit.com/r/u_Octoparseideas/comments/vdpmh7/summer_sale_2022_still_live/) *on 6/16/2022.*
How much do you know about web scraping? Don't worry even if you are new to the concept. In this article, we will brief you on the basics of web scraping, teach you how to assess web scraping tools so you can pick one that best fits your needs, and, last but not least, present a list of web scrapers for your reference.
[https://www.octoparse.com/blog/9-free-web-scrapers-that-you-cannot-miss/?utm\_source=sale2022&utm\_medium=10freewebscrapers&utm\_campaign=reddit](https://www.octoparse.com/blog/9-free-web-scrapers-that-you-cannot-miss/?utm_source=sale2022&utm_medium=10freewebscrapers&utm_campaign=reddit)
*Posted to Reddit as* [*10 FREE Web Scrapers That You Cannot Miss in 2022*](https://www.reddit.com/r/u_Octoparseideas/comments/us3vu2/10_free_web_scrapers_that_you_cannot_miss_in_2022/) *on 5/18/2022.*
*Posted to Reddit as* [*How to Scrape Real-Time Data*](https://www.reddit.com/r/u_Octoparseideas/comments/nwh8pk/how_to_scrape_realtime_data/) *on 6/10/2021 (post content removed).*
Octoparse salutes Black Friday: [https://youtu.be/En7MS6lo8WQ](https://youtu.be/En7MS6lo8WQ)
This Black Friday, shop with Octoparse! Lower prices and a new version!
Save 30-40% from 11.17 to 12.03, 2021 (23:59 EST)
Extra 10-15% OFF on the first day, 11.17 EST ONLY
And get free giveaways: crawler + training
The new 8.4 experience:
8.4 ships cool new features: custom user agent, page scroll-down, Zapier integration
Faster engine, more intuitive layout, and robust exporting
Tune in to Octoparse for the coming Black Friday and save big!
*Posted to Reddit as* [*Octoparse salutes Black Friday*](https://www.reddit.com/r/u_Octoparseideas/comments/qrb7e4/octoparse_salutes_black_friday/) *on 11/11/2021.*
Octoparse users, how's your journey with the product?
Share your story with Octoparse with hashtag "[\#OctoparseinYourArea](https://www.linkedin.com/feed/hashtag/?keywords=octoparseinyourarea&highlightedUpdateUrns=urn%3Ali%3Aactivity%3A6856055699533918208)" on Facebook / Twitter / LinkedIn, and win the FREE gift pack of all scrapers created in the video series "Learn from Community"([https://hubs.la/H0ZCX\_j0](https://hubs.la/H0ZCX_j0))!
Come and participate in the event!
*Posted to Reddit as* [*Event - #OctoparseinYourArea*](https://www.reddit.com/r/Octoparse_ideas/comments/qb25q5/event_octoparseinyourarea/) *on 10/19/2021.*
[https://www.octoparse.com/blog/introducing-the-new-octoparse-84/?re=](https://www.octoparse.com/blog/introducing-the-new-octoparse-84/?re=)
Welcome to the new version 8.4 of Octoparse. Isn't it exciting? Let me walk you through the latest version 8.4!
# #1 More intuitive to use with new layout
In this version, the workflow panel has moved from the left to the right side, which makes it more efficient to edit the workflow while looking at the built-in browser. The new look combines the designs of the previous version with an advanced settings area sitting at the corner. Compared with the previous version, which had these settings hidden behind gear icons under each action, the new layout has more room, making it easier for users to configure actions.
https://preview.redd.it/4ykainzpocw71.png?width=600&format=png&auto=webp&v=enabled&s=7943d07327721261984f9a5f34d1ffd33bbdf71d
In addition, you can choose between horizontal and vertical views. The horizontal view lets you see the data row by row, whereas the vertical view previews a specific line and shows all its data fields vertically. This view is helpful when you need to edit a batch of data fields at the same time.
https://preview.redd.it/nrxw1xqrocw71.png?width=600&format=png&auto=webp&v=enabled&s=9de6fc3093c6d87ace119b43ab258c1346ecb937
https://preview.redd.it/v9zd07htocw71.png?width=600&format=png&auto=webp&v=enabled&s=1a83318b540488bf63d505e1e6b222af248461ed
# #2 Optimized Rendering Engine
With continuous improvement, Octoparse 8.4 is like an almighty browser that lets you switch between the most popular engines for different websites.
For instance, Chrome is more compatible with sites like Google Maps, whereas Firefox consumes less memory and performs best on FTP sites. If you run into a problem displaying a webpage, you can always switch engines from the settings area.
Octoparse 8.4 implements the WebView technique inside the browser, which greatly reduces freezing. The page-freezing issues have been fixed in the latest update with robust compatibility.
In addition, the browser now allows users to interact with partial web content. For example, you may find it difficult to scrape Google Maps, as it scrolls dynamically within a section of the page instead of the whole page.
https://preview.redd.it/kcqigxpvocw71.png?width=600&format=png&auto=webp&v=enabled&s=e7b7525d10aef86c557014b1806fd27d8ee3caaf
Click here to scrape while [scrolling within a certain section](https://helpcenter.octoparse.com/hc/en-us/articles/4406988884249-Scrolling-within-a-designated-area-of-a-web-page)
# #3 Integration
With the new version, you can use [Zapier](https://zapier.com/) to integrate [Octoparse ](https://helpcenter.octoparse.com/hc/en-us/sections/4406348754329-Data-Export)with thousands of apps like Google Drive, Google Sheets, Slack, and many more. This lets you automate without writing complex API calls, and you can share the data with your team members at any time.
https://preview.redd.it/mt4ua1mxocw71.png?width=600&format=png&auto=webp&v=enabled&s=158c80db912f12eb1acd3ee03f37b1284d7354bb
[Find more information here and have a try.](https://zapier.com/apps/octoparse/integrations)
# #4 Other new updates and tutorials
## Formatting the timestamp
This feature is mainly designed for scraping social media platforms. [Converting post timestamps to dates](https://timestamp.online/) is now available in version 8.4.
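Under the hood, that conversion is ordinary Unix-timestamp-to-date formatting. A minimal sketch in Python (the second timestamp value is just an example):

```python
from datetime import datetime, timezone

def timestamp_to_date(ts: int) -> str:
    """Convert a Unix timestamp (seconds since epoch) to a readable UTC date string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

print(timestamp_to_date(0))           # 1970-01-01 00:00:00
print(timestamp_to_date(1635496474))  # a timestamp from late October 2021
```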
## Backup local data to the Cloud
This feature used to be available for enterprise users only. In the new 8.4.2 version, it is open to users with professional plans as well.
## Customize the user agent
You can change the user agent string and the user agent name on browsers when using version 8.4.2 to scrape data.
Learn about how to [add a custom user agent](https://helpcenter.octoparse.com/hc/en-us/articles/4407023809177-Add-custom-User-Agent) in OP 8.4.
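Outside of the Octoparse UI, a custom user agent is simply an HTTP request header. A minimal sketch using Python's standard library, with a placeholder URL and user agent string:

```python
import urllib.request

# The URL and User-Agent string below are placeholders for illustration.
req = urllib.request.Request(
    "https://example.com/",
    headers={"User-Agent": "Mozilla/5.0 (compatible; MyScraper/1.0)"},
)

# The header is attached to the request and sent as-is when the
# request is opened (urllib normalizes the key to "User-agent").
print(req.get_header("User-agent"))
```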
Alright, that's not all. Check out our 8.4 video for a quick walkthrough and more details about the new version!
[https://youtu.be/TPiZ8H7rNdk](https://youtu.be/TPiZ8H7rNdk)
As always, don't hesitate to contact us at [support@octoparse.com](mailto:support@octoparse.com) or [submit a ticket](https://helpcenter.octoparse.com/hc/en-us/requests/new) if you have any questions. The customer service team will be ready to help. Happy scraping!
*Posted to Reddit as* [*Introducing the New Octoparse 8.4*](https://www.reddit.com/r/u_Octoparseideas/comments/qi87it/introducing_the_new_octoparse_84/) *on 10/29/2021.*
*Posted to Reddit as* [*Where can I find experienced tech/SEO writers?*](https://www.reddit.com/r/content_marketing/comments/lechaj/where_can_i_find_experienced_techseo_writers/) *on 2/7/2021 (post content removed).*
*Originally published as* [*https://www.octoparse.com/blog/data-collection-from-websites/?re=*](https://www.octoparse.com/blog/data-collection-from-websites/?re=) *on March 1, 2022.*
How to collect data from websites? With the technology of web scraping, automation, and RPA, data collection can go way deeper than just bringing together copies of data. As the old saying goes, a good start is half of the success. In this article, we’ll focus on the data collection part of it, specifically, why do people collect web data, and how to get it done effectively.
https://preview.redd.it/4szxbhd8qvk81.png?width=1080&format=png&auto=webp&v=enabled&s=9cfbdcc6b9f1ec0af703b97919ffab3e70a8b5f6
# What’s Data Collection
Data collection is the process of collecting information from one or more sources in a systematic way. Admittedly, this is a broad definition, and data collection practices vary a great deal across circumstances.
However different they are, well-defined projects have some things in common:
* The collecting process is usually systematic in one way or another. Tools are often used to carry out the process.
* The collected data must be transformed into the format of the platform where it will be processed.
Here is a definition by Wikipedia (more in a research context):
>*Data collection is the process of gathering and measuring information on targeted variables in an established systematic fashion, which then enables one to answer relevant questions and evaluate outcomes.*
# What’s the Goal of Data Collection?
* Through data collection, we can capture high-quality evidence to build convincing and credible answers to the questions raised. (Academic research is a typical example.)
* Businesses may want to use the collected web data to build profitable services or to get a panoramic view of the market.
* Companies may need to collect data for data migration purposes.
* See [What People Scrapes When They Scrape the Web](https://www.octoparse.com/blog/what-is-web-scraping-basics-and-use-cases#web_scraping2) for a more comprehensive view of what people do with scraped data.
Many companies need to extract data from websites to meet their various needs. But along the way, they may run into problems such as collecting irrelevant or duplicate data, having insufficient time or budget, lacking [useful tools](https://www.octoparse.com/blog/top-20-web-crawling-tools-for-extracting-web-data), or failing to extract dynamic data.
Where problems exist, so do solutions. Before getting frustrated, the first thing we can do is [make a data collection plan](https://www.dimagi.com/data-collection/):
1. Define your project goal
2. Clarify your data requirements
3. Decide on a data collection approach
4. Carry out the process
# Data Collection Approaches
When collecting data from the web, you’ll need at least two things handy: a useful data collection tool and a list of data sources.
## Data sources: websites for data collection
Some websites offer rich statistical data for visitors to download, and they can be valuable sources for researchers. For your reference, here is a list of [70 open data sources](https://www.octoparse.com/blog/big-data-70-amazing-free-data-sources-you-should-know-for-2021). These are websites owned by governments, organizations, and business service providers, spanning industries such as health, finance, and crime. Hopefully, you’ll find something you need.
## Web scraping tools to collect data from websites
Tools can work wonders if you know how to use them effectively. A no-code data collection tool can get you exactly what you want in a short time, whereas gathering the same information by copying and pasting could take far longer.
With the help of [data collection and analytics tools](https://www.octoparse.com/blog/top-30-big-data-tools-for-data-analysis), organizations are also able to collect data from mobile devices, website traffic, server activity, and other relevant sources, depending on the project.
[Web scraping](https://www.octoparse.com/blog/what-is-web-scraping-basics-and-use-cases) is a powerful technique to download data from websites — all kinds of data including:
✅ [Text and articles](https://www.octoparse.com/blog/build-your-blog-fast-with-web-scraping)
✅ Numerical data
✅ [Tables](https://www.octoparse.com/blog/scrape-data-from-a-table)
✅ Listings
✅ [Images](https://www.octoparse.com/blog/how-to-get-images-from-any-website)
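For readers comfortable with a little code, the idea behind extracting tabular data can be sketched in plain Python. The snippet below parses a small inline HTML table with only the standard library; the table is a made-up stand-in for a fetched page (in real use you would download the HTML first, or let a scraping tool handle it):

```python
from html.parser import HTMLParser

# Stand-in for a downloaded page; the cities and figures are invented.
SAMPLE = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Berlin</td><td>3644826</td></tr>
  <tr><td>Madrid</td><td>3223334</td></tr>
</table>
"""

class TableParser(HTMLParser):
    """Collects <tr> rows as lists of cell strings."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._cell = [], None, None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row:
            self.rows.append(self._row)
            self._row = None

parser = TableParser()
parser.feed(SAMPLE)
print(parser.rows)  # header row followed by the data rows
```

A no-code tool performs essentially this detection and extraction for you, without the parser boilerplate.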
*Tips: Octoparse is a web scraping tool designed to gather website data without coding. Instead of learning Python from scratch, a no-code tool offers an easy start. If you have any specific data requirements, feel free to contact us at support@octoparse.com.*
# Big Data and Data Collection
Big data aims to help people gain insights through data analysis and make data-driven decisions. There’s no doubt that data collection builds the foundation for big data applications. Together with new technologies such as machine learning and artificial intelligence that use complex algorithms to look for repeatable patterns among the collected data, we are getting closer to the time when data can truly “speak” for itself.
**About Web Scraping:**
[What’s Web Scraping](https://www.octoparse.com/blog/what-is-web-scraping)
[How Does Web Scraping Work](https://www.octoparse.com/blog/web-scraping-introduction)
[Web Scraping Using Google Sheets](https://www.octoparse.com/blog/simple-web-scraping-using-google-sheets)
[Scraping Data from Websites Using Excel](https://www.octoparse.com/blog/scraping-data-from-website-to-excel)
[Best Web Scraper for Mac](https://www.octoparse.com/blog/best-web-scraper-for-mac)
t4p22h
Octoparse_ideas
Octoparseideas
t3_t4p22h
https://www.reddit.com/r/Octoparse_ideas/comments/t4p22h/collecting_data_from_websites_without_coding/
3/2/2022 2:16:11 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Collecting Data from Websites without Coding Skills
False
1
t4p22h
0
32057
3
3
34
3.65591397849462
12
1.29032258064516
0
0
501
53.8709677419355
930
Red
10
Dash Dot Dot
20
No
975
Posted
8/30/2022 4:10:32 AM
Google Maps is an excellent source for finding business leads and contacts. Because data about businesses worldwide is available on Google Maps, it can be a go-to resource when you are researching local businesses and need their data.
In this blog post, we will discuss what data we can scrape from Google Maps and how to export Google Maps search results to an Excel or CSV file.
[https://www.octoparse.com/blog/export-google-maps-search-results-to-excel/?utm\_source=2022q3&utm\_medium=export-google-maps-search-results-to-excel&utm\_campaign=reddit](https://www.octoparse.com/blog/export-google-maps-search-results-to-excel/?utm_source=2022q3&utm_medium=export-google-maps-search-results-to-excel&utm_campaign=reddit)
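As a rough sketch of the export step: assuming the scrape has already produced a list of business records (the names and numbers below are invented for illustration), writing them to a CSV file that Excel opens directly takes only the Python standard library:

```python
import csv

# Hypothetical rows standing in for scraped Google Maps search results;
# a scraping tool would export similar fields.
leads = [
    {"name": "Blue Bottle Cafe", "address": "1 Main St", "phone": "555-0100"},
    {"name": "Corner Bakery",    "address": "9 Oak Ave", "phone": "555-0199"},
]

with open("leads.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "address", "phone"])
    writer.writeheader()     # column headers on the first row
    writer.writerows(leads)  # one business per row

# leads.csv now opens directly in Excel.
```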
x19a3i
u_Octoparseideas
Octoparseideas
t3_x19a3i
https://www.reddit.com/r/u_Octoparseideas/comments/x19a3i/can_you_export_google_map_search_results_to_excel/
8/30/2022 4:10:32 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Can You Export Google Map Search Results to Excel
False
1
x19a3i
0
32057
3
3
8
6.34920634920635
0
0
0
0
69
54.7619047619048
126
Red
10
Dash Dot Dot
20
No
974
Posted
6/21/2021 4:25:32 AM
[removed]
o4mz0r
u_Octoparseideas
Octoparseideas
t3_o4mz0r
https://www.reddit.com/r/u_Octoparseideas/comments/o4mz0r/how_to_download_images_from_url_list/
6/21/2021 4:25:32 AM
1/1/0001 12:00:00 AM
False
False
2
1
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Download Images from URL List
False
0.75
o4mz0r
0
32057
3
3
0
0
0
0
0
0
1
100
1
Red
10
Dash Dot Dot
20
No
973
Posted
11/4/2021 7:36:21 AM
[https://youtu.be/-a\_qYfjDXwY](https://youtu.be/-a_qYfjDXwY)
💥 Look forward to Octoparse Black Friday Sale! 💥
Check out the new video tutorial and learn how to scrape Google SERP for SEO!
qmffam
u_Octoparseideas
Octoparseideas
t3_qmffam
https://www.reddit.com/r/u_Octoparseideas/comments/qmffam/how_to_scrape_google_serp_for_seo/
11/4/2021 7:36:21 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to scrape Google SERP for SEO
False
1
qmffam
0
32057
3
3
0
0
0
0
0
0
18
58.0645161290323
31
Red
10
Dash Dot Dot
20
No
972
Posted
11/12/2021 8:02:12 AM
Why is food delivery data important? Believe it or not, most people have experienced this: being too tired or too busy to cook or go out to eat, reaching for their smartphones, and opening a food delivery app. Thanks to online food offers, you can now order in whenever you want and enjoy a delicious meal in your comfortable pajamas.
Given both the growing demand and the cultural climate, restaurants that don’t offer food delivery risk falling behind their competitors. To maintain a steady stream of revenue and stay ahead in the industry, merchants must adapt to these changes in consumer habits.
**Whether you are a merchant or a consumer, you can use Octoparse, the no-code web scraping tool, to scrape food delivery data for free.**
* For merchants who are new to online food delivery and would like to know more, scraping data can help them in market research.
* For consumers, especially foodies and gourmets who are enthusiastic about recommending delicious food, scraped data can help them locate good restaurants in bulk and expand their range of recommendations.
Keep reading: [https://www.octoparse.com/blog/scrape-food-delivery-data-from-uber-eats-for-free/?re=](https://www.octoparse.com/blog/scrape-food-delivery-data-from-uber-eats-for-free/?re=)
qs6ibr
u_Octoparseideas
Octoparseideas
t3_qs6ibr
https://www.reddit.com/r/u_Octoparseideas/comments/qs6ibr/scrape_food_delivery_data_from_uber_eats_for_free/
11/12/2021 8:02:12 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scrape Food Delivery Data from Uber Eats for Free
False
1
qs6ibr
0
32057
3
3
12
5.45454545454545
3
1.36363636363636
0
0
108
49.0909090909091
220
Red
10
Dash Dot Dot
20
No
971
Posted
12/9/2021 7:24:05 AM
https://youtu.be/4a2KY58JHII
rccg9f
u_Octoparseideas
Octoparseideas
t3_rccg9f
https://www.reddit.com/r/u_Octoparseideas/comments/rccg9f/what_can_octoparse_do_to_help_your_business/
12/9/2021 7:24:05 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What can Octoparse do to Help your Business
False
1
rccg9f
0
32057
3
3
Red
10
Dash Dot Dot
20
No
970
Posted
9/23/2021 1:40:42 AM
[https://youtu.be/j-kcLl\_6WZk](https://youtu.be/j-kcLl_6WZk)
Yellowpages is one of the largest business directory websites. Using Octoparse, we can extract data such as business phone numbers, addresses, emails, and anything else that appears on the page.
Today, we will continue the eCommerce series and talk about how to get local sales leads from Yellowpages and increase your brand awareness by reaching more local audiences who may purchase your products or services.
https://reddit.com/link/ptktba/video/dzuedh6up5p71/player
ptktba
u_Octoparseideas
Octoparseideas
t3_ptktba
https://www.reddit.com/r/u_Octoparseideas/comments/ptktba/how_to_acquire_sales_leads_from_the_local/
9/23/2021 1:40:42 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Acquire Sales Leads from the Local Community from Yellowpages?
False
1
ptktba
0
32057
3
3
1
1.35135135135135
0
0
0
0
40
54.0540540540541
74
Red
10
Dash Dot Dot
20
No
969
Posted
8/24/2022 3:51:37 AM
Are you looking for a YouTube comment scraper for sentiment analysis? If so, you have come to the right place. This article shows you how to scrape YouTube comments using the two easiest methods.
[https://www.octoparse.com/blog/youtube-comment-scraper/?utm\_source=2022q3&utm\_medium=youtube-comment-scraper&utm\_campaign=reddit](https://www.octoparse.com/blog/youtube-comment-scraper/?utm_source=2022q3&utm_medium=youtube-comment-scraper&utm_campaign=reddit)
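Once the comments are exported, even a toy sentiment pass can be sketched in a few lines of Python. The wordlists and comments below are invented for illustration; a real analysis would use a proper library such as VADER or TextBlob:

```python
# Toy sentiment scoring over scraped comments. The wordlists are
# illustrative placeholders, not a real sentiment lexicon.
POSITIVE = {"great", "love", "awesome", "helpful"}
NEGATIVE = {"bad", "hate", "boring", "broken"}

def score(comment: str) -> int:
    """Positive minus negative word count; >0 leans positive."""
    words = comment.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

comments = [
    "Great tutorial, love it",
    "boring and broken audio",
]
print([score(c) for c in comments])
```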
ww9497
Octoparse_ideas
Octoparseideas
t3_ww9497
https://www.reddit.com/r/Octoparse_ideas/comments/ww9497/scrape_youtube_comments_for_sentiment_analysis/
8/24/2022 3:51:37 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Scrape YouTube Comments for Sentiment Analysis
False
1
ww9497
0
32057
3
3
2
2.73972602739726
0
0
0
0
46
63.013698630137
73
Red
10
Dash Dot Dot
20
No
968
Posted
9/13/2021 9:01:17 AM
Travel rules keep changing with the Covid case curve, and with the Delta variant, cases are rising. As I compile this article, the EU is considering reimposing travel restrictions on U.S. visitors.
Anyway, I have built my Tripadvisor scraper with Octoparse and crawled the information on destinations that are open to U.S. citizens, so I can always be prepared for a refreshing trip.
Note: If you are setting out to [these countries](https://edition.cnn.com/travel/article/us-international-travel-covid-19/index.html), you may want to check whether vaccination or quarantine is required.
By the way, [web scraping](https://en.wikipedia.org/wiki/Web_scraping) is by far the best way to pull down web data so we can sift through it and get the most value out of it. I will show how it helps me collect travel data.
[Geo Map generated by mapchart.net](https://preview.redd.it/qed95mdmj8n71.png?width=698&format=png&auto=webp&v=enabled&s=c99add0b04792a5b656841703f28f8a68c58a081)
# Web Scraping Travel Data
Do you have any idea about [big data in tourism](https://www.octoparse.com/blog/big-data-in-tourism)?
Businesses in the travel industry track all kinds of data: travel agents’ business data, for example, and visitors’ behavioral data on travel-related platforms. They may know your traveling habits better than you do. The whole industry leverages big data to launch the right products and find the right people to pay for their services.
Web scraping is the tech that makes this possible.
As a traveler, I want scraped travel data to serve my own needs: finding the most attractive destinations and getting Tripadvisor guides for reference.
**What I am going to do**
* First of all, I need a list of countries to look into.
* Secondly, I will use a web scraping tool, Octoparse, to build a Tripadvisor scraper and crawl these countries’ travel data.
* Finally, I am going to pack my baggage and head for the destination that best fits my travel taste!
# Where Can an American Go
So, where can an American go for travel now?
[This article by CNN](https://edition.cnn.com/travel/article/us-international-travel-covid-19/index.html) lists the destinations that are open to the U.S. (the list may be updated from time to time).
What I want to do is pull all the country names on this web page into a spreadsheet, so I can paste them into Octoparse and get more specific data from Tripadvisor.
[Octoparse: How to get list information on a web page into excel](https://preview.redd.it/rwk0eqipj8n71.png?width=699&format=png&auto=webp&v=enabled&s=475dfc66e71e66eee1560b80190bc46645bce67f)
Octoparse can easily pull list information on a web page into Excel or CSV.
This is extremely helpful when you want a list of URLs or data that you can paste into another platform, or import into data analytics software for analysis.
Now that I have the text list of destinations, I am going to build a Tripadvisor scraper to get specific data about these places.
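For the curious, the "list on a page into a spreadsheet" step can be sketched in standard-library Python. The markup below is a made-up stand-in for the CNN destination list (a real run would fetch and clean the page first, which the no-code tool does for you):

```python
import csv
import xml.etree.ElementTree as ET

# Placeholder markup standing in for the destination list on the page.
SNIPPET = """<ul>
  <li>Iceland</li>
  <li>Croatia</li>
  <li>Mexico</li>
</ul>"""

countries = [li.text.strip() for li in ET.fromstring(SNIPPET).iter("li")]

# One country per row, ready to paste elsewhere or open in Excel.
with open("countries.csv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows([c] for c in countries)

print(countries)
```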
[Keep reading: build a Tripadvisor scraper](https://www.octoparse.com/blog/tripadvisor-scraper-top-destinations-open-to-the-us-citizens-under-covid/?re=)
pnbqds
Octoparse_ideas
Octoparseideas
t3_pnbqds
https://www.reddit.com/r/Octoparse_ideas/comments/pnbqds/tripadvisor_scraper_top_destinations_open_to_the/
9/13/2021 9:01:17 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Tripadvisor Scraper: Top Destinations Open to the U.S. Citizens under Covid
False
1
pnbqds
0
32057
3
3
11
1.94003527336861
0
0
0
0
274
48.3245149911817
567
Red
10
Dash Dot Dot
20
No
967
Posted
6/24/2022 2:22:32 AM
Data crawling refers to collecting data from the web or from any document or file. Here, I’d like to introduce 3 ways to crawl data from a website, along with the pros and cons of each approach.
[https://www.octoparse.com/blog/how-to-crawl-data-from-a-website/?utm\_source=sale2022&utm\_medium=crawldatafromawebsite&utm\_campaign=reddit](https://www.octoparse.com/blog/how-to-crawl-data-from-a-website/?utm_source=sale2022&utm_medium=crawldatafromawebsite&utm_campaign=reddit)
vjdjlx
Octoparse_ideas
Octoparseideas
t3_vjdjlx
https://www.reddit.com/r/Octoparse_ideas/comments/vjdjlx/how_to_crawl_data_from_a_website/
6/24/2022 2:22:32 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Crawl Data from a Website
False
1
vjdjlx
0
32057
3
3
1
1.12359550561798
1
1.12359550561798
0
0
45
50.561797752809
89
Red
10
Dash Dot Dot
20
No
966
Posted
11/16/2021 3:54:17 AM
Everything you need to know as an Octoparse beginner:
[https://helpcenter.octoparse.com/hc/en-us/articles/4409179533465-Join-our-Beginner-Academy-Learn-web-scraping-in-the-community/?re=](https://helpcenter.octoparse.com/hc/en-us/articles/4409179533465-Join-our-Beginner-Academy-Learn-web-scraping-in-the-community/?re=)
If you are new to web scraping with Octoparse and have found it a bit tricky to learn alone, we are holding the Beginner Academy as the finale of 2021.
In the community, you will have access to onboarding courses, exercises, and a community group where you can discuss with fellow learners and submit questions to our professional support.
quyrzq
Octoparse_ideas
Octoparseideas
t3_quyrzq
https://www.reddit.com/r/Octoparse_ideas/comments/quyrzq/join_our_beginner_academy_learn_web_scraping_in/
11/16/2021 3:54:17 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Join our Beginner Academy: Learn web scraping in the community
False
1
quyrzq
0
32057
3
3
1
0.943396226415094
1
0.943396226415094
0
0
56
52.8301886792453
106
Red
10
Dash Dot Dot
20
No
965
Posted
10/25/2021 4:09:36 AM
[View Poll](https://www.reddit.com/poll/qf8otm)
qf8otm
Octoparse_ideas
Octoparseideas
t3_qf8otm
https://www.reddit.com/r/Octoparse_ideas/comments/qf8otm/whats_your_favorite_new_feature_in_octoparse_842/
10/25/2021 4:09:36 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
What's your favorite new feature in Octoparse 8.4.2 version?
False
1
qf8otm
0
32057
3
3
Red
10
Dash Dot Dot
20
No
964
Posted
11/4/2021 7:36:57 AM
[https://youtu.be/-a\_qYfjDXwY](https://youtu.be/-a_qYfjDXwY)
💥 Look forward to Octoparse Black Friday Sale! 💥
Check out the new video tutorial and learn how to scrape Google SERP for SEO!
qmffj2
Octoparse_ideas
Octoparseideas
t3_qmffj2
https://www.reddit.com/r/Octoparse_ideas/comments/qmffj2/how_to_scrape_google_serp_for_seo/
11/4/2021 7:36:57 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to scrape Google SERP for SEO
False
1
qmffj2
0
32057
3
3
0
0
0
0
0
0
18
58.0645161290323
31
Red
10
Dash Dot Dot
20
No
963
Posted
6/22/2022 9:15:57 AM
Here is a list of the 30 most popular web scraping software tools. I group them all under the umbrella of software, though they range from open-source libraries and browser extensions to desktop applications and more.
[https://www.octoparse.com/blog/top-30-free-web-scraping-software/?utm\_source=sale2022&utm\_medium=top30freewebscrapingsoftware&utm\_campaign=reddit](https://www.octoparse.com/blog/top-30-free-web-scraping-software/?utm_source=sale2022&utm_medium=top30freewebscrapingsoftware&utm_campaign=reddit)
vi0ycw
Octoparse_ideas
Octoparseideas
t3_vi0ycw
https://www.reddit.com/r/Octoparse_ideas/comments/vi0ycw/top_30_free_web_scraping_software_in_2022/
6/22/2022 9:15:57 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Top 30 Free Web Scraping Software in 2022
False
1
vi0ycw
0
32057
3
3
5
6.41025641025641
0
0
0
0
46
58.974358974359
78
Red
10
Dash Dot Dot
20
No
962
Posted
6/8/2021 11:25:44 AM
[removed]
nv1snc
u_Octoparseideas
Octoparseideas
t3_nv1snc
https://www.reddit.com/r/u_Octoparseideas/comments/nv1snc/build_a_reddit_image_scraper_without_coding/
6/8/2021 11:25:44 AM
1/1/0001 12:00:00 AM
False
False
2
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Build a Reddit Image Scraper without Coding
False
1
nv1snc
0
32057
3
3
0
0
0
0
0
0
1
100
1
Red
10
Dash Dot Dot
20
No
961
Posted
11/22/2021 9:11:42 AM
[http://www.dataextraction.io/?p=1163/?re=](http://www.dataextraction.io/?p=1163/?re=)
As a hard-core fan of all kinds of music, I am always eager to know what’s new and popular on Billboard. Since I also play music and write music reviews, I need to analyze the latest hits on the website. But copying and pasting the list of songs manually is time-consuming, so what can I do to speed up the process?
Thanks to Octoparse, I can complete the list-crawling task with a Billboard music scraper I created myself. In this article, I will demonstrate how [web scraping](https://en.wikipedia.org/wiki/Web_scraping) works by crawling the Hot 100 songs on Billboard. The same approach also works for scraping listings from websites in other industries.
# Build A Billboard Music Scraper in 3 Steps
Although I know nothing about coding, I can use [Octoparse](https://www.octoparse.com/download/windows) to set up my own Billboard music scraper in only 3 steps. What I intend to do is extract the information about the Hot 100 songs. Normally, I would visit [Billboard.com](https://www.billboard.com/), find the “[Hot 100](https://www.billboard.com/charts/hot-100)” section, open it, and start copying and pasting the data I need. Now, the whole process can be done by a scraping bot.
# Step 1 Enter the URL of Billboard and Find the Music Listings You Would Like to Crawl
After launching the software and logging in, I need to enter the URL of Billboard, and click on **“Start”**.
https://preview.redd.it/jyon9t4h44181.png?width=1400&format=png&auto=webp&v=enabled&s=ce03fb721c1c526f31b2b099f564bbd13016f4fe
After the webpage finishes loading in Octoparse’s built-in browser, I should click on the **“Hot 100”** section under the scraping mode, and select **“Click URL”** on the **“Tips”** panel.
https://preview.redd.it/i5q0sryi44181.png?width=1400&format=png&auto=webp&v=enabled&s=4f68269edbc11a8dd761523e8f58032d258e4888
# Step 2 Generate the Workflow of Your Billboard Music Scraper
Now, I can click on **“Auto-detect web data”** to find data on the page automatically.
https://preview.redd.it/phhv70ok44181.png?width=1400&format=png&auto=webp&v=enabled&s=dd83c50cd8255d5d020a1394f19ce6e2b9b28297
Then, I need to switch the auto-detect results to the **“Hot 100”** chart. Since all 100 songs are on the same page and I can scrape while scrolling, there is no need to **“Paginate to scrape more pages”**, so I just uncheck that box. After clicking **“Create workflow”**, my Billboard music scraper is ready.
https://preview.redd.it/cfm3ac9z44181.png?width=1400&format=png&auto=webp&v=enabled&s=237003885669094a930045b3b40b2d8a2b9b7103
# Step 3 Run the Task You Build and Extract the Data
After saving the task, I am ready to run the task and extract the data I need. Hit **“Run”**, and Octoparse will start to work for me. Since I’m a premium user, I can choose to extract the data either on my local device or on the cloud.
https://preview.redd.it/qqze4ve254181.png?width=1160&format=png&auto=webp&v=enabled&s=2b873d2cdc1435d4e4c129e3ad3158ef1a712544
Free users can only extract data locally. [Cloud extraction](https://helpcenter.octoparse.com/hc/en-us/articles/360018047092-What-is-Cloud-Extraction-) is available for premium users and is more convenient: the data is saved to the cloud for easy access, and the task can be scheduled to run at any time.
I decide to run the task on my device this time. Then, tada! The Hot 100 data is extracted in seconds.
https://preview.redd.it/966985b454181.png?width=1400&format=png&auto=webp&v=enabled&s=b6b4009d8aeed62dd3b88a452ab61c9058f471da
# List Crawling Examples: What Kinds of Listings Are Most Frequently Scraped?
Scraping the Billboard Hot 100 is impressively quick, and the same approach can be applied to list crawling in a variety of industries. Let’s look at the most frequently scraped listings below.
# Real Estate Listings
Finding a suitable house to rent is a top priority when moving to a new city, and skimming real estate websites to check listings one by one is not a delightful experience. Gathering all the available rentals through web scraping eliminates the tedious, repetitive manual work. For real estate agents, scraping listings is also an efficient way to satisfy their customers’ demands.
# E-Commerce Product Listings
Product listings on e-commerce platforms are abundant resources for both merchants and customers. List crawling is so popular in the e-commerce industry that almost all store owners [scrape product listings](https://www.octoparse.com/blog/the-easiest-way-to-extract-data-from-e-commerce-websites) for price monitoring and market research. For e-commerce startups, product scraping is also beneficial for selecting potential products to sell and optimizing the business strategy.
# Job Listings
Job seekers, especially fresh graduates, are frequent visitors to job search websites like Indeed, LinkedIn, and Glassdoor. Thousands of jobs across several niches are posted on these websites every day. For job seekers, [scraping job listings](http://www.dataextraction.io/?p=1125) can speed up the process of finding a dream job.
# Travel Data and Hotel Listings
Although Covid-19 still looms, people who love traveling are always ready for a refreshing trip. By [scraping travel data and hotel listings](https://www.octoparse.com/blog/tripadvisor-scraper-top-destinations-open-to-the-us-citizens-under-covid), travelers can pull down the web data they need and find destinations currently open to them. For travel agencies, list crawling in this industry offers a chance to track tourists’ behavior and understand their habits better.
# Conclusion
The listings mentioned above can all be scraped with Octoparse, through both [built-in templates](https://helpcenter.octoparse.com/hc/en-us/articles/900003158843-Task-Templates-Version-8-) and self-built crawlers. With a no-code web scraping tool, list crawling in any industry is within reach. Enjoy your journey with Octoparse, and feel free to contact us at [support@octoparse.com](mailto:support@octoparse.com) if you run into any problems.
qzhf2t
Octoparse_ideas
Octoparseideas
t3_qzhf2t
https://www.reddit.com/r/Octoparse_ideas/comments/qzhf2t/list_crawling_build_a_billboard_music_scraper_for/
11/22/2021 9:11:42 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
List Crawling: Build A Billboard Music Scraper for Free
False
1
qzhf2t
0
32057
3
3
42
4.48239060832444
9
0.96051227321238
0
0
464
49.5197438633938
937
Red
10
Dash Dot Dot
20
No
960
Posted
10/8/2021 7:18:36 AM
Participate in the event on Twitter and win FREE gifts: [https://hubs.la/H0Z13tf0](https://hubs.la/H0Z13tf0)
https://preview.redd.it/iu4t1j4eg6s71.png?width=1600&format=png&auto=webp&v=enabled&s=86c15488f53b75c61bab5b2ebffe76aba067be5b
🌟Octoparsing with Zapier🌟
3 steps to win gifts worth $270:
1. Connect Octoparse with Zapier & Export your cloud data to any app.
2. Take a screenshot showing the successful data export.
3. Share your screenshot & feedback quoting the event tweet.
Join us and play!
q3spil
u_Octoparseideas
Octoparseideas
t3_q3spil
https://www.reddit.com/r/u_Octoparseideas/comments/q3spil/participate_in_the_octoparsing_with_zapier_event/
10/8/2021 7:18:36 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Participate in the "Octoparsing with Zapier" event to win FREE gifts
False
1
q3spil
0
32057
3
3
Red
10
Dash Dot Dot
20
No
959
Commented
10/8/2021 7:21:58 AM
- Gold: Amazon Gift Card $20 + Custom Crawler Coupon $250
- Silver: Amazon Gift Card $10
- Bronze: Amazon Gift Card $5
Find the tutorial here: https://helpcenter.octoparse.com/hc/en-us/articles/4406338353689-How-to-Connect-Octoparse-with-Zapier
hftujgy
u_Octoparseideas
Octoparseideas
t1_hftujgy
https://www.reddit.com/r/u_Octoparseideas/comments/q3spil/participate_in_the_octoparsing_with_zapier_event/hftujgy/
10/8/2021 7:21:58 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
q3spil
t3_q3spil
q3spil
0
q3spil
True
False
False
0
32057
3
3
Red
10
Dash Dot Dot
20
No
958
Posted
3/22/2022 3:22:40 AM
Imagine your company has just developed a new type of shampoo. Wouldn’t it be great to have a list of all the beauty salons near you, as well as those across the entire nation? Wouldn’t it be even better if you could conveniently find their contact details: addresses, email addresses, phone numbers, and Facebook pages?
This is what lead generation does. In this article, we are going to talk about lead generation and how to harvest sales leads (email addresses) from websites.
[https://www.octoparse.com/blog/email-extractor-geathering-sales-leads-in-minutes/?re=](https://www.octoparse.com/blog/email-extractor-geathering-sales-leads-in-minutes/?re=)
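For the extraction step itself, here is a minimal sketch in Python. The addresses below are invented, and the regex handles common well-formed addresses rather than the full RFC grammar:

```python
import re

# A simple pattern for everyday addresses; not RFC 5322-complete.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Invented page text standing in for scraped salon listings.
page_text = "Contact us at hello@salonone.example or sales@salontwo.example."

emails = EMAIL_RE.findall(page_text)
print(emails)
```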
tjtph7
Octoparse_ideas
Octoparseideas
t3_tjtph7
https://www.reddit.com/r/Octoparse_ideas/comments/tjtph7/how_to_use_email_extractors_to_collect_sales/
3/22/2022 3:22:40 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to Use Email Extractors to Collect Sales Leads in Minutes
False
1
tjtph7
0
32057
3
3
10
9.00900900900901
0
0
0
0
49
44.1441441441441
111
Red
10
Dash Dot Dot
20
No
957
Posted
12/8/2021 1:55:14 AM
https://youtu.be/dxKTTKlBTQo
rbfkf0
Octoparse_ideas
Octoparseideas
t3_rbfkf0
https://www.reddit.com/r/Octoparse_ideas/comments/rbfkf0/how_to_scrape_facebook_account_with_octoparse/
12/8/2021 1:55:14 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
How to scrape Facebook account with Octoparse
False
1
rbfkf0
0
32057
3
3
Red
10
Dash Dot Dot
20
No
956
Posted
2/15/2022 8:15:24 AM
*Originally published as* [https://www.octoparse.com/blog/movie-crawler-scraping-100-000plus-movie-information/?re=](https://www.octoparse.com/blog/movie-crawler-scraping-100-000plus-movie-information/?re=) *on February 15th, 2022.*
Are you looking to scrape movie data from websites like IMDb, Flixster, and Rotten Tomatoes? I will introduce an easy-to-use movie scraper with which you can gather all the on-page data without any coding skills.
https://preview.redd.it/yqh0mtjegyh81.png?width=1080&format=png&auto=webp&v=enabled&s=610a475f31f2fb8dd836c9054a0d4f29b301ace4
# What you can get with a movie scraper
This movie scraper helps you gather data such as:
* Movie name
* Year
* Category
* Ratings
* Introduction
* Cast
* Cover image (URL)
You may also scrape other data, such as movie reviews or TV show information, as long as it appears on the web page. Once you get the hang of it, you can customize your scraper to get whatever data you want.
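As a sketch of the kind of record such a scraper emits, here is one movie expressed as a small Python dataclass. The field names are assumptions matching the list above, and the rating and cover URL are placeholder values:

```python
from dataclasses import dataclass, asdict

# Illustrative schema for one scraped movie record; the field names are
# assumptions, and the rating value is a made-up example.
@dataclass
class Movie:
    name: str
    year: int
    category: str
    rating: float
    cover_url: str

m = Movie(
    name="The Shawshank Redemption",
    year=1994,
    category="Drama",
    rating=9.3,  # placeholder rating
    cover_url="https://example.com/cover.jpg",  # placeholder URL
)
print(asdict(m))
```

In practice, the scraper's export (Excel, CSV, or JSON) carries one such record per row.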
# Getting Started
To help you gather data, this article walks you through a web scraping case: scraping information from the IMDb movie list — [IMDb Top 250 Movies](https://www.imdb.com/chart/top/?ref_=nv_mv_250).
We will start with the basic information: movie name, year, featured page URLs, cover image, and ratings.
(Once you master the technique, you can use [the advanced search](https://www.imdb.com/search/title/?start=1&explore=title_type,genres&ref_=adv_nxt) to filter out the movies you are interested in and pull the whole list down.)
## Prerequisites:
* [Download Octoparse](https://www.octoparse.com/download) (Mac & Windows versions available; here are [some instructions](https://helpcenter.octoparse.com/hc/en-us/articles/4406078522265-Download-Installation-Login-Windows-Mac-))
* Target URL (in this case: [https://www.imdb.com/chart/top/?ref\_=nv\_mv\_250)](https://www.imdb.com/chart/top/?ref_=nv_mv_250))
Yes, we will be using this link to scrape the top 250 movies on IMDb:
[https://www.imdb.com/chart/top/?ref\_=nv\_mv\_250](https://www.imdb.com/chart/top/?ref_=nv_mv_250)
If you want to learn some basics of this movie scraper first, this is a bit of intro: [basic logic of using Octoparse](https://helpcenter.octoparse.com/hc/en-us/articles/900000659063-Lesson-0-Octoparse-Basics) (“The interface” Section recommended).
If you don’t want to bother reading anything else, rest assured: this guide is easy to follow. In fact, there are only a few steps.
# Scraping Top 250 Movies in 30 Seconds
This is a step-by-step guide to get IMDb movie data with Octoparse’s auto-detection mode.
A quick view over the guide:
* Step 1: Open the target website in the Octoparse built-in browser.
* Step 2: Click the “Auto-detect web page data”.
* Step 3: Select the dataset you want to scrape and click “Create workflow” to confirm.
* Step 4: As the workflow is created, click “Run” to let it run.
* Step 5: Export the data for offline use.
Let’s dive right in!
## Step 1: Open the target website in the Octoparse built-in browser.
On the Home page, simply enter the URL in the search bar and press Enter. The built-in browser will start to render the page.
https://preview.redd.it/wlskw39pgyh81.png?width=800&format=png&auto=webp&v=enabled&s=a4ab6069edd4632a34441444d7968c31d0afa36b
## Step 2: Click the “Auto-detect web page data”.
Once the URL is rendered in the Octoparse built-in browser, you will notice a yellow Tips panel with options suggesting what to do next.
https://preview.redd.it/krkzewxrgyh81.png?width=500&format=png&auto=webp&v=enabled&s=ef492c472380c3484ebffb563414f8580ddd62a5
At this point, click the option “Auto-detect web page data” and Octoparse will scan the page thoroughly.
## Step 3: Select the dataset you want to scrape and click “Create workflow” to confirm.
When the auto-detection finishes, Octoparse shows what it has found on the page that most likely matches what you are looking for, and there may be more than one data result to choose from.
Look at the lower part of the interface: the preview box now shows the top recommended data result. In this case, it is exactly the data we set out to scrape.
https://preview.redd.it/6fsxq2ywgyh81.png?width=1024&format=png&auto=webp&v=enabled&s=4978cb217306dbfcc5957d5376a92c1444eec5b1
If you want to check the other results Octoparse detected, click “Switch auto-detect results”. Once you have made your decision, click “Create workflow” to confirm your pick.
## Step 4: As the workflow is created, click “Run” to let it run.
After clicking “Create workflow”, you will see the interface change, and a workflow for your movie scraper will appear on the right side.
The workflow is the set of commands and rules the scraper follows when it runs. In this case, auto-detection built it for you with Octoparse’s algorithm. You can learn how to [build a workflow yourself](https://helpcenter.octoparse.com/hc/en-us/categories/4406059450265-Version-8-4) later to create a more customized scraper.
For now, we have what we want. Click the small blue “Run” button in the upper right to start the scraper. If you are on the free plan, choose to run on your local device.
***Tips:*** *Running in the cloud is faster and can help avoid being blocked; click here to learn more about the advantages of* [*cloud scraping*](https://www.octoparse.com/blog/cloud-extraction-work-247-with-3-10-times-faster)*.*
I still got my data in 30 seconds. Web scraping is so amazing!
https://preview.redd.it/tqtj23f1hyh81.png?width=800&format=png&auto=webp&v=enabled&s=fb070052dd00f78a7bdbea3c0792152c39247e96
## Step 5: Export the data for your offline use.
You have now seen how fast a web scraper can copy data from the web. Once the data is collected and arranged, you can export it in formats like Excel, CSV, HTML, or JSON.
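If you later want to reproduce those export formats in code, Python’s standard library is enough. A rough sketch with made-up sample rows:

```python
import csv
import io
import json

# A few scraped rows as dictionaries (sample values, not real export data).
rows = [
    {"name": "The Shawshank Redemption", "year": 1994, "rating": 9.3},
    {"name": "The Godfather", "year": 1972, "rating": 9.2},
]

def to_csv(records: list) -> str:
    """Serialize records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

csv_text = to_csv(rows)          # spreadsheet-friendly
json_text = json.dumps(rows, indent=2)  # API/pipeline-friendly
print(csv_text)
```

CSV suits spreadsheets and quick inspection; JSON keeps types and nests better if you feed the data into another program.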
We made it: a smart IMDb movie scraper. In the same manner, we can build a Flixster movie scraper, a Rotten Tomatoes review scraper, a Netflix TV series scraper, or whatever you want.
***Tips:*** *If you try something new and the auto-detection doesn’t give you what you need, feel free to contact us at support@octoparse.com. Our professional support will help you out.*
# Final Words
With the steps above, anyone, including those with no programming background, can easily build a movie crawler with Octoparse and collect more than 100,000 lines of movie information.
More important than the data itself is the skill you have learned, which will be extremely useful whenever you need data for market research, analysis, and many other tasks.
Article in Spanish: [Scraping más de 100,000 información de películas](http://www.octoparse.es/blog/scraping-100-000-infor-de-pel%C3%ADculas)
You can also read more web scraping articles (in Spanish) on [the official website](http://www.octoparse.es/).

---

*Posted by u/Octoparseideas in r/Octoparse_ideas on 2/15/2022 as “Easy-to-use Movie Scraper | Scraping Movies from IMDb, Flixster, etc.” ([permalink](https://www.reddit.com/r/Octoparse_ideas/comments/ssy1go/easytouse_movie_scraper_scraping_movies_from_imdb/))*

---
Check out this quick demo video introducing the new features and improvements in Octoparse 8.5. Complete video 👉 [https://youtu.be/nVycXF3np1o](https://youtu.be/nVycXF3np1o)
https://reddit.com/link/t41fuo/video/58bewj31vpk81/player

---

*Posted by u/Octoparseideas in r/Octoparse_ideas on 3/1/2022 as “What are the highlights in Octoparse 8.5 version?” ([permalink](https://www.reddit.com/r/Octoparse_ideas/comments/t41fuo/what_are_the_highlights_in_octoparse_85_version/))*

---
This article will serve as a guide to give you insights into the Data Extraction procedure, its types, and its perks. Additionally, we will talk about the top 10 data extraction tools to watch out for in 2022.
[https://www.octoparse.com/blog/top-data-extraction-tools/?utm\_source=2022q3&utm\_medium=top-data-extraction-tools&utm\_campaign=reddit](https://www.octoparse.com/blog/top-data-extraction-tools/?utm_source=2022q3&utm_medium=top-data-extraction-tools&utm_campaign=reddit)

---

*Posted by u/Octoparseideas on 8/31/2022 as “Top 10 Data Extraction Tools in 2022” ([permalink](https://www.reddit.com/r/u_Octoparseideas/comments/x2717t/top_10_data_extraction_tools_in_2022/))*

---

---

*Posted by u/Octoparseideas on 6/25/2021 as “Scrape Yahoo Finance Market Data for Free” ([permalink](https://www.reddit.com/r/u_Octoparseideas/comments/o7it3n/scrape_yahoo_finance_market_data_for_free/)); the post body has since been removed.*

---
[http://www.dataextraction.io/?p=1167/?re=](http://www.dataextraction.io/?p=1167/?re=)
Content creators are constantly under pressure to come up with new ideas. Regular updates are essential, yet if you have focused on one niche for some time, it is natural to run dry on the topic.
You need more perspectives and new triggers to open up your mind. A web scraping tool and a list of valuable sources could be extremely helpful in this case.
# Who Will Benefit from This Article
* Youtube creators looking for new topics to make videos
* Blog writers seeking new ideas to write
* Content marketers wanting to source ideas efficiently
* Inbound marketing professionals looking for SEO-friendly topics
We are flooded with opinions and content every day. How do we generate ideas from it all?
Reading everything is not practical. We need a structured way to observe, analyze, and extract value from what is out there. A web scraping tool can help build a database that writers and creators can filter through for inspiration.
Beyond listing the places where you can source ideas, I will also show how web scraping can improve the process.
# 7 Ways to Curate Content Ideas
1. Youtube Channel Crawler
2. Study Google SERP
3. Listen to social media discussions
4. Spy on Competitors’ Blog
5. Idea generator
6. Keyword planning tools (Semrush/Ahrefs)
7. Look back at what you have created
## Youtube Channel Crawler
A few months ago, I shared a post about how to use a [Youtube channel crawler](https://www.octoparse.com/blog/youtube-channel-crawler) to review your channel and spy on competitors’.
[https://youtu.be/oJtawA\_0bHI](https://youtu.be/oJtawA_0bHI)
In this way, you can grab topic data (titles, tags, descriptions) and performance data (views, likes) of published videos. Filtering through the data, you will see what to focus on: the well-performing topics that your channel has never touched.
Always keep up with the competition.
A topic performs well on another channel and you haven’t made anything out of it yet? That’s potential. Spot the gap and fill it with a video better than any of the existing ones.
Besides, the comments viewers leave under a video are important signals too: they let you gauge how intriguing the topic is to your target audience and learn what you can improve to rank above your rivals.
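The gap-spotting step can be sketched in a few lines of Python, assuming you have already scraped a competitor’s topic and view counts (all sample values below are invented):

```python
# Toy illustration of "spot the gap": well-performing competitor topics
# minus topics your own channel has already covered.
competitor_videos = [
    {"topic": "web scraping basics", "views": 120_000},
    {"topic": "xpath tutorial", "views": 45_000},
    {"topic": "scrape amazon reviews", "views": 90_000},
]
my_topics = {"web scraping basics"}

def topic_gaps(theirs: list, mine: set, min_views: int = 50_000) -> list:
    """Competitor topics above a view threshold that you have not covered yet."""
    return sorted(
        v["topic"] for v in theirs
        if v["views"] >= min_views and v["topic"] not in mine
    )

print(topic_gaps(competitor_videos, my_topics))
```

Here only the Amazon-reviews topic clears the view threshold while being absent from your own channel, so it is the gap worth filling first.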
## Study Google SERP
A Google SERP is helpful if you want to generate SEO-performing content.
* Top-ranking articles on the result page tell us what people are looking for. Always match your content with the need of your audience.
* Title and description (and the query you are searching for) shed light on what topics are welcome and what words or phrases can be utilized as SEO-friendly keywords.
[Google SERP Crawler Works for SEO Benefits](https://preview.redd.it/ua94cluenh281.png?width=1024&format=png&auto=webp&v=enabled&s=522923fdc4144ec3ed36ef8be0cb5aed7f6cf9a8)
Use your general idea as the query to search on Google and the search results help you niche down to a more targeted scope.
## Listen to Social Media Discussions
When sourcing content ideas, don’t overlook social media platforms: Youtube, Instagram, and Reddit, just to name a few. They are valuable and resourceful, because conversations reveal what people think and care about.
Social media and Q&A platforms to source ideas:
* Twitter
* Instagram
* Facebook
* Youtube
* Medium
* Reddit
* Quora
* Niche communities (e.g. Stack Overflow for techies)
However, the key is to find the places where your target audience gathers. Use hashtags, groups, subreddits, channels, and direct search to filter out the noise and focus on the voices of your potential audience.
## Spy on Competitors’ Blog
If you are a Youtuber, keep a close eye on your competitors’ channels; the same goes for bloggers. And do not limit yourself to direct rivals: any website or channel that shares your audience is worth attention. They help us think outside the box and see what extra value we can offer our audience.
## Idea Generator
There are web-based tools designed to help creators find related topics. Most of them use algorithms to arrange recurring words into certain patterns and generate questions or titles for writers to work from. The titles and ideas offered may not be a perfect fit to use directly, but they can broaden your mind.
\>>Question generator: [AnswerthePublic](https://answerthepublic.com/)
\>>Title generator: [HubspotTopic](https://www.hubspot.com/blog-topic-generator)
## Keywords Planning Tool
Keywords are so strongly associated with SEO that they are often mistaken for a magic source of traffic. In reality, keywords and their popularity do not guarantee traffic; they are a signal of what people are seeking. Hence, a keyword planning tool is an instrument for content marketers to discover customer needs.
A good content idea is not necessarily new, original, or fancy, but it must meet the real needs of your audience.
[SEMrush](https://www.semrush.com/lp/sem/en) and [Ahrefs](https://ahrefs.com/) are good tools to explore keywords and find out what people are looking for on the Internet. There are also keyword tools tailored for Youtubers, like [Socialblade](https://socialblade.com/) and [KeywordTool](https://keywordtool.io/youtube).
## Look Back at What You Have Created
A super influencer with 1 million followers may not care much about SEO; new releases reach their followers anyway. Social sharing plays the big role, and a good ranking is just a byproduct.
A growing channel is different: SEO matters a lot. Analyzing what you have published and how it performed is key to understanding who your audience is and what kind of content attracts them.
If you are running a blog, make sure to set up [Google Analytics](https://analytics.google.com/analytics/web/) and [Google Search Console](https://search.google.com/search-console/about) to monitor the performance of your posts and make adjustments accordingly.
# Conclusions
You may source content ideas as an observer, browsing existing posts, reviews, blogs, and videos; both the quality and the quantity of your input matter. On the other hand, creating opportunities for direct interaction with your audience, like holding an interview or starting a live conversation, may inspire you in a new way.

---

*Posted by u/Octoparseideas in r/Octoparse_ideas on 11/29/2021 as “7 Ways to Find Content Ideas from the Web and Create Like a Pro” ([permalink](https://www.reddit.com/r/Octoparse_ideas/comments/r4qkcu/7_ways_to_find_content_ideas_from_the_web_and/))*

---
Black Friday is around the corner, and deals are rolling. You might notice that many retailers have been promoting their early Black Friday deals since the end of October. Product prices are fast-changing at this time.
In this article, we’ll show you how to collect product prices on Amazon with Octoparse, a no-code data scraper that lets users collect data from webpages in only a few clicks, regardless of coding skills.
[https://www.octoparse.com/blog/amazon-black-friday-price-strategy?utm\_source=reddit&utm\_medium=social&utm\_campaign=article-promotion](https://www.octoparse.com/blog/amazon-black-friday-price-strategy?utm_source=reddit&utm_medium=social&utm_campaign=article-promotion)
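Once prices are collected, comparing two scraped snapshots is straightforward. A minimal Python sketch, with hypothetical product IDs and prices standing in for real scraped data:

```python
# Sketch of a price monitor: compare two scraped snapshots of the same
# product IDs (made-up sample values) and report percentage changes.
yesterday = {"B001": 59.99, "B002": 120.00}
today = {"B001": 49.99, "B002": 120.00}

def price_changes(old: dict, new: dict) -> dict:
    """Percent change per product ID, for IDs present in both snapshots."""
    return {
        pid: round((new[pid] - old[pid]) / old[pid] * 100, 1)
        for pid in old.keys() & new.keys()
    }

print(price_changes(yesterday, today))
```

Run the scraper on a schedule, diff consecutive snapshots like this, and you have a basic Black Friday price alert.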

---

*Posted by u/Octoparseideas in r/Octoparse_ideas on 11/23/2022 as “How to Monitor Competitors' Prices on Amazon During Black Friday” ([permalink](https://www.reddit.com/r/Octoparse_ideas/comments/z2ebny/how_to_monitor_competitors_prices_on_amazon/))*

---
With more and more information available on the website, it is nearly impossible for individuals to find what they need without some help sorting through the data. Google ranking systems are designed to complete the task, which is sorting through hundreds of billions of web pages in the search index to find the most relevant, useful results in a fraction of a second, and presenting them in a way that helps you find what you are looking for. These ranking systems are made up of not one, but a whole series of algorithms.
To give you the most useful information, search algorithms look at many factors, including the words of your query, relevance, and usability of pages, the expertise of sources, and your location and settings. The weight applied to each factor varies depending on the nature of your query. SEO (Search Engine Optimization) is the process of affecting the visibility of a website or a web page in a web search engine’s unpaid results. In fact, this is the free way to improve your Google ranking and attract more traffic.
https://preview.redd.it/05tfmbvqfvr81.png?width=1080&format=png&auto=webp&v=enabled&s=fcc20d239d9601b2a06db4a00b5b7fabaf28030e
# SEO & Google Ranking Improvement
A study by Infront Webworks showed that the first page of Google receives 95% of web traffic, with subsequent pages receiving 5% or less of total traffic. So for most people, especially those who want to start their business with limited funds, SEO (Search Engine Optimization) is a good way to improve Google ranking to display their websites and attract more people to the websites at a relatively little cost.
Contrary to what some ‘experts’ would have you believe, SEO does not have cut-and-dry rules. There is no plug-and-play method to SEO success. This is mostly because search engines are always updating their algorithms. And with every new algorithm update comes new guidelines to follow if you want to rank high.
SEO is a broad field with many factors that affect Google ranking, such as:
* **On-page factors:** keyword in the title tag, keyword in H1 tag, description, the length of content, etc.
* **Site factors:** sitemap, domain trust, server location, etc.
* **Off-page factors:** the number of linking domains, domain authority of linking page, the authority of linking domain, etc.
* **Domain factors:** domain registration length, domain history, etc.
(Note: For more details, you could refer to [30 Most Important Google Ranking Factors A Beginner Should Know](https://unamo.com/blog/seo/30-important-google-ranking-factors-beginner-know))
**Most of these factors could be researched with web scraping tools in a free way** (refer to [Top 30 Free Web Scraping Software](https://www.octoparse.com/blog/top-30-free-web-scraping-software/) for more information). And with enough information, you could develop a better strategy to improve your Google ranking.
So in this post, I will focus only on keyword research, backlink research, and LCP improvement to show you how to identify projected traffic, and ultimately how to determine the value of that ranking, in a free and easy way.
# Keyword Research
Keywords are the specific words or phrases that someone enters into a search box on a site like Google to find information.
To help your business show up higher in the search results, it is important to research and discover what your customers and prospects are searching for and then create content that targets those terms.
I bet you would say, “Oh, it’s easy. You know, there are plenty of keyword research tools, [Keyword Planner](https://www.google.com.hk/intl/en/adwords/?channel=ha-ef&sourceid=awo&subid=hk-en-ha-rhef-skhp0~200331725223&gclid=Cj0KEQjwz_TMBRD0jY-RusGilOYBEiQAN-TuFKvM3258DNURsErkKrAwxzhqdW7kGeSDwae1nDWiZJwaAsHq8P8HAQ&dclid=CI_As8H67tUCFcWlvQodl8wADQ), [Buzzsumo](https://app.buzzsumo.com/research/most-shared), for example. They all could help me find the most valuable keywords to target with SEO.”
Yes, that is right. But how can you judge the value of those keywords? How do you know you are getting the right kind of visitors?
The answer is to research your market’s keyword demand, predict shifts in demand, and produce content that web searchers are actively seeking. The tools mentioned above would only show us the keywords that visitors often type into search engines. However, they cannot show us directly how valuable it is to receive traffic from those searches. To understand the value of a keyword, we need to understand our own websites, make some hypotheses, test, and repeat — the classic web marketing formula. Here I would show you how it works.
For example, suppose you chose some target keywords and produced content for them a while ago; now you need to measure the effects. That is to say: when searchers use these keywords on Google, do they find your website and come to it? That is why you need to know your ranking first.
Right out from building a strategy, you need data on keywords, competitor rankings, consumer content preferences, link building data, etc. When you start to implement your strategy, you need to collect data to measure how well your efforts to use SEO for growth are yielding results. And if they aren’t working, you need data to find out why.
I will take [www.octoparse.com](http://www.octoparse.com/) for example to illustrate that.
How can I find the ranking of the Octoparse domain when searching for the two relevant keywords “free [web scraping](https://www.octoparse.com/WebScraping) tool” and “free web scraping service”? And how can I see the details of the results ranking above Octoparse, so that I better understand the value of those keywords?
The answer is a web scraping tool. One of the major benefits of web scraping is that it allows you to collect data in large volumes. Keyword data may count as small data, but it is still large enough to cause quite a headache if you collect it manually.
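Once a results page has been scraped, finding a domain’s position is a simple scan. A sketch assuming the SERP has already been reduced to an ordered list of result URLs (the URLs below are invented for illustration):

```python
from urllib.parse import urlparse

# Hypothetical scraped SERP: result URLs in ranked order for one query.
serp = [
    "https://www.scrapinghub.example/tools",
    "https://www.octoparse.com/WebScraping",
    "https://www.example.org/free-scrapers",
]

def domain_rank(results, domain):
    """1-based position of the first result on `domain`, or None if absent."""
    for pos, url in enumerate(results, start=1):
        if urlparse(url).netloc.removeprefix("www.") == domain:
            return pos
    return None

print(domain_rank(serp, "octoparse.com"))
```

Repeat this over the SERPs for each keyword you track, and you have a free rank monitor; the results ranking above you are exactly the competitors worth studying.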
With [the web scraping tool Octoparse](https:/...

---

*Posted by u/Octoparseideas on 4/6/2022 as “How to Improve Google Ranking for Free in 2022” ([permalink](https://www.reddit.com/r/u_Octoparseideas/comments/txhln9/how_to_improve_google_ranking_for_free_in_2022/))*

---
We all know how hard it is to build an email sales list from scratch, so we need help from email scraping tools. Email scraping can help you collect email addresses shown publicly using a bot. In this article, I profiled a list of the best 11 email scraping tools for sales prospecting. Let's take a look.
[https://www.octoparse.com/blog/best-email-scraping-tools-for-sales-prospecting-in-2019/?utm\_source=sale2022&utm\_medium=bestemailscrapingtools&utm\_campaign=reddit](https://www.octoparse.com/blog/best-email-scraping-tools-for-sales-prospecting-in-2019/?utm_source=sale2022&utm_medium=bestemailscrapingtools&utm_campaign=reddit)

---

*Posted by u/Octoparseideas in r/Octoparse_ideas on 6/20/2022 as “11 Best Email Scraping Tools for Sales Prospecting in 2022” ([permalink](https://www.reddit.com/r/Octoparse_ideas/comments/vgfwaq/11_best_email_scraping_tools_for_sales/))*

---
Take an EXTRA 10% off everything on Jun.15th only!
【Standard Year】Save $271 + FREE crawler + 1-on-1 training
【Professional Year】Save $800 + FREE crawler\*3 + 1-on-1 training\*3
👉 Click to check out the deals: [https://www.octoparse.com/summer-sale-2022/?utm\_source=reddityure&utm\_medium=trailer0608&utm\_campaign=22summersale](https://www.octoparse.com/summer-sale-2022/?utm_source=reddityure&utm_medium=trailer0608&utm_campaign=22summersale)
https://preview.redd.it/xutdkfefpb491.png?width=800&format=png&auto=webp&v=enabled&s=65df540332a9e5c4bd6da8ee6f15424d614f278f

---

*Posted by u/Octoparseideas on 6/8/2022 as “🤩 Octoparse Summer Sale Sneak Peek” ([permalink](https://www.reddit.com/r/u_Octoparseideas/comments/v7gmfh/octoparse_summer_sale_sneak_peek/))*

---
[https://youtu.be/aEh1coudY9s](https://youtu.be/aEh1coudY9s)
In this tutorial, I’ll show you how to use web scraping templates in Octoparse 8.4 to extract Amazon product reviews in 3 easy steps.
The ready-to-use templates are a unique feature of Octoparse. They are prebuilt crawlers that can scrape popular websites such as Amazon, Facebook, and many more. Since all the data fields are pre-set, there is no need to configure the crawler yourself: simply enter the search value and it will fetch the data for you right away.
The Amazon template we picked in this video takes ASINs as input and gathers the rating, review date, and review text of each product.
The sample output gives you an idea of what the end result will look like when the run completes. You should get all the information in a nice, structured format!

---

*Posted by u/Octoparseideas on 12/1/2021 as “How to scrape Amazon product reviews in three easy steps” ([permalink](https://www.reddit.com/r/u_Octoparseideas/comments/r643bf/how_to_scrape_amazon_product_reviews_in_three/))*

---

---

*Posted by u/Octoparseideas on 6/18/2021 as “How to Scrape Q&A Sites like Quora” ([permalink](https://www.reddit.com/r/u_Octoparseideas/comments/o2jfp6/how_to_scrape_qa_sites_like_quora/)); the post body has since been removed.*

---

---

*Posted by u/Octoparseideas on 11/5/2021 as “Extract Emails from Any Website for Cold Email Marketing” ([permalink](https://www.reddit.com/r/u_Octoparseideas/comments/qn6u2g/extract_emails_from_any_website_for_cold_email/)); the post body has since been removed.*

---
https://www.octoparse.com/blog/introducing-the-new-octoparse-84/?re=

---

*Posted by u/Octoparseideas in r/webscraping on 10/29/2021 as “Introducing the New Octoparse 8.4” ([permalink](https://www.reddit.com/r/webscraping/comments/qi88b9/introducing_the_new_octoparse_84/))*

---
[https://www.octoparse.com/blog/whats-new-about-octoparse-842/?re=](https://www.octoparse.com/blog/whats-new-about-octoparse-842/?re=)
Octoparse users, how is your web scraping journey with the software going? In late September, version 8.4.2 of the product will be released. Want to know what’s new in the upcoming version? Keep reading!
# 1. Zapier integration
In version 8.4.2, you can auto-export your cloud data with [Zapier](https://zapier.com/) to Google Drive, Google Sheet, and more software.
https://preview.redd.it/gorj0bg70em71.png?width=600&format=png&auto=webp&v=enabled&s=f2a4c380b46de62886668d59a3a25be2a3882b1c
[Find more information here and have a try.](https://zapier.com/apps/google-drive/integrations/octoparse)
# 2. Scrape while scrolling within a certain section
Take Google Maps as an example: with this feature in version 8.4.2, you can enter the webpage and scrape only the search results within a scrolling section. The feature is implemented by setting up an [XPath](https://www.octoparse.com/blog/what-is-xpath-and-how-to-use-it-in-octoparse).
https://preview.redd.it/kvol52h90em71.png?width=599&format=png&auto=webp&v=enabled&s=31c1df410af7eeca75bd2a252d21e4bdc58d402f
# 3. Customize the user agent
You can change the user agent string and the user agent name used by the built-in browser when scraping data with version 8.4.2.
To understand how user agents work, this article can be helpful: [How to Change User Agents in Chrome, Edge, Safari & Firefox](https://www.searchenginejournal.com/change-user-agent/368448/)
# 4. Backup local data to the Cloud
This feature used to be available for enterprise users only. In the new 8.4.2 version, it is open to users with professional plans as well.
# 5. Formatting the timestamp
This feature is mainly designed for scraping social media platforms. [Converting post timestamps to dates](https://timestamp.online/) is available in version 8.4.2.
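For reference, the conversion itself is a one-liner in most languages. A Python sketch with an arbitrary sample timestamp:

```python
from datetime import datetime, timezone

# A Unix timestamp like those attached to social media posts
# (arbitrary sample value).
ts = 1631152669  # seconds since the epoch

def to_date(unix_seconds: int) -> str:
    """Render a Unix timestamp as a UTC date-time string."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S"
    )

print(to_date(ts))  # → 2021-09-09 01:57:49
```

Having the tool do this during extraction simply saves you a post-processing step over the exported data.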
# 6. Other updates in existing features and the UI
With these updates, version 8.4.2 will be more stable and convenient to use than earlier versions.
Don’t hesitate to contact us at [support@octoparse.com](mailto:support@octoparse.com) or [submit a ticket](https://helpcenter.octoparse.com/hc/en-us/requests/new) here if you have any questions. The customer service team will be ready to help you as always. Wish you an even happier scraping then!

---

*Posted by u/Octoparseideas on 9/9/2021 as “What’s New about Octoparse 8.4.2?” ([permalink](https://www.reddit.com/r/u_Octoparseideas/comments/pkolv8/whats_new_about_octoparse_842/))*

---
[https://youtu.be/j-kcLl\_6WZk](https://youtu.be/j-kcLl_6WZk)
Yellowpages is the largest business directory website. With Octoparse, we can extract data like business phone numbers, addresses, emails, and anything else that appears on the page.
Today, we will continue the eCommerce series and talk about how to get sales leads from the local community on Yellowpages, and how to increase your brand awareness by reaching more local audiences who might purchase your services or products.
https://reddit.com/link/ptkxgb/video/mcspog8pq5p71/player
---
Post ptkxgb: "How to Acquire Sales Leads from the Local Community from Yellowpages?" (u/Octoparseideas in r/Octoparse_ideas, 9/23/2021 1:47:35 AM)
https://www.reddit.com/r/Octoparse_ideas/comments/ptkxgb/how_to_acquire_sales_leads_from_the_local/
---
Posted 8/9/2021 3:04:53 AM
[removed]
---
Post p0t01t: "How to Scrape Amazon with Octoparse" (u/Octoparseideas in r/u_Octoparseideas, 8/9/2021 3:04:53 AM)
https://www.reddit.com/r/u_Octoparseideas/comments/p0t01t/how_to_scrape_amazon_with_octoparse/
---
Posted 9/14/2022 3:02:59 AM
If you want to monitor your website's ranking on Google, analyze your competitors, or analyze paid ads on Google, then scraping the search results is the best way to get started.
In this article, we are going to learn about 3 different ways that you can use to scrape Google Search Results.
[https://www.octoparse.com/blog/scrape-google-search-results/?utm\_source=2022q3&utm\_medium=scrape-google-search-results&utm\_campaign=reddit](https://www.octoparse.com/blog/scrape-google-search-results/?utm_source=2022q3&utm_medium=scrape-google-search-results&utm_campaign=reddit)
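Whichever of the three approaches you take, the first thing it must handle is that search result pages are addressed by query parameters; a hedged sketch of building paginated result URLs (parameter names as commonly observed, not guaranteed stable):

```python
from urllib.parse import urlencode

def google_search_url(query: str, page: int = 0) -> str:
    """Build a Google results URL; the 'start' parameter paginates in steps of 10."""
    params = {"q": query, "start": page * 10}
    return "https://www.google.com/search?" + urlencode(params)

print(google_search_url("web scraping tools", page=2))
```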
---
Post xdqp0d: "3 Easy Ways to Scrape Google Search Results" (u/Octoparseideas in r/Octoparse_ideas, 9/14/2022 3:02:59 AM)
https://www.reddit.com/r/Octoparse_ideas/comments/xdqp0d/3_easy_ways_to_scrape_google_search_results/
---
Posted 6/10/2021 6:57:02 AM
[removed]
---
Post nwh8pk: "How to Scrape Real-Time Data" (u/Octoparseideas in r/u_Octoparseideas, 6/10/2021 6:57:02 AM)
https://www.reddit.com/r/u_Octoparseideas/comments/nwh8pk/how_to_scrape_realtime_data/
---
Posted 6/23/2022 3:42:30 AM
To download images from a list of links, you may want to look into "bulk image downloaders". Inspired by the inquiries we have received, I decided to compile a "top 5 bulk image downloaders" list for you. Be sure to check out this article if you want to download images from links at zero cost.
[https://www.octoparse.com/blog/bulk-download-images-from-links-top-5-bulk-image-downloaders/?utm\_source=sale2022&utm\_medium=top5imagedownloaders&utm\_campaign=reddit](https://www.octoparse.com/blog/bulk-download-images-from-links-top-5-bulk-image-downloaders/?utm_source=sale2022&utm_medium=top5imagedownloaders&utm_campaign=reddit)
---
Post viniro: "Top 5 Image Downloaders to Download from Any Websites/URLs" (u/Octoparseideas in r/Octoparse_ideas, 6/23/2022 3:42:30 AM)
https://www.reddit.com/r/Octoparse_ideas/comments/viniro/top_5_image_downloaders_to_download_from_any/
---
Posted 6/24/2022 8:36:41 AM
The online job market has undoubtedly overtaken in-person hiring. This is especially true since COVID-19, as most cities around the globe have faced rounds of lockdowns and more jobs have shifted to remote work. In this sense, scraping job posting data helps not only institutions and organizations but also individual job seekers.
[https://www.octoparse.com/blog/web-scraping-job-postings/?utm\_source=sale2022&utm\_medium=webscrapingjobpostings&utm\_campaign=reddit](https://www.octoparse.com/blog/web-scraping-job-postings/?utm_source=sale2022&utm_medium=webscrapingjobpostings&utm_campaign=reddit)
---
Post vjjoor: "A Complete Guide to Web Scraping Job Postings" (u/Octoparseideas in r/Octoparse_ideas, 6/24/2022 8:36:41 AM)
https://www.reddit.com/r/Octoparse_ideas/comments/vjjoor/a_complete_guide_to_web_scraping_job_postings/
---
Posted 9/13/2021 9:01:17 AM
Travel rules are changing along with the Covid case curve. With the Delta variant, cases are rising; as I compile this article, the EU is considering reimposing travel restrictions on U.S. visitors.
Anyway, I have built my Tripadvisor scraper with Octoparse and crawled the details of destinations that are open to U.S. citizens. It always pays to be prepared for a refreshing trip.
Note: If you are setting out to [these countries](https://edition.cnn.com/travel/article/us-international-travel-covid-19/index.html), you may want to check whether vaccination or quarantine is required.
By the way, [web scraping](https://en.wikipedia.org/wiki/Web_scraping) is definitely the best way to pull down web data so we can sift through it and get the most value out of it. I will show how it helps me get the travel data.
[Geo Map generated by mapchart.net](https://preview.redd.it/qed95mdmj8n71.png?width=698&format=png&auto=webp&v=enabled&s=c99add0b04792a5b656841703f28f8a68c58a081)
# Web Scraping Travel Data
Do you have any idea about [big data in tourism](https://www.octoparse.com/blog/big-data-in-tourism)?
Businesses in the travel industry track all kinds of data: business data from travel agents, for example, and visitors’ behavioral data on travel-related platforms. They may know your traveling habits better than you do. The whole industry leverages big data to launch the right products and find the right people to pay for their services.
Web scraping is the tech that makes this possible.
As a traveler, I want scraped travel data to serve my own needs: finding the most attractive destinations and collecting Tripadvisor guides for reference.
**What I am going to do**
* First of all, I need a list of countries to look into.
* Secondly, I will use a web scraping tool, Octoparse, to build a Tripadvisor scraper and crawl these countries’ travel data.
* Finally, I am going to pack my baggage and head for the destination that fits my travel taste most!
# Where Can an American Go
So, where can an American go for travel now?
[This article by CNN](https://edition.cnn.com/travel/article/us-international-travel-covid-19/index.html) lists the destinations that are open to U.S. travelers (the list may be updated from time to time).
What I want to do is pull all the country names on this web page into a spreadsheet so I can paste them into Octoparse to get more specific data from Tripadvisor.
[Octoparse: How to get list information on a web page into excel](https://preview.redd.it/rwk0eqipj8n71.png?width=699&format=png&auto=webp&v=enabled&s=475dfc66e71e66eee1560b80190bc46645bce67f)
Octoparse can easily get list information on a web page into Excel or CSV.
This is extremely helpful when you want a list of URLs or a list of data to paste and search on another platform, or to import into data analytics software for analysis.
Now that I have the text list of destinations, I am going to build a Tripadvisor scraper to get specific data about these places.
[Keep reading: build a Tripadvisor scraper](https://www.octoparse.com/blog/tripadvisor-scraper-top-destinations-open-to-the-us-citizens-under-covid/?re=)
---
Post pnbqds: "Tripadvisor Scraper: Top Destinations Open to the U.S. Citizens under Covid" (u/Octoparseideas in r/Octoparse_ideas, 9/13/2021 9:01:17 AM)
https://www.reddit.com/r/Octoparse_ideas/comments/pnbqds/tripadvisor_scraper_top_destinations_open_to_the/
---
Posted 4/28/2023 2:43:31 AM
The stock market is known for its rapid changes in response to various factors. To make smarter investments amid these fluctuations, data can provide valuable insights. Check out this article to learn how to scrape data from Yahoo! Finance’s stock quotes page, clean and analyze the data, and identify stocks with rising or falling values.
[https://www.octoparse.com/blog/scraping-and-cleansing-yahoo-finance-data#](https://www.octoparse.com/blog/scraping-and-cleansing-yahoo-finance-data#)
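The "clean and analyze" step boils down to stripping display formatting and comparing prices; a toy Python sketch with made-up quotes (these are not real Yahoo! Finance field names):

```python
# Hypothetical scraped quotes; strip formatting and flag rising vs falling stocks.
def parse_price(text: str) -> float:
    """Turn a display string like '$1,234.50' into a float."""
    return float(text.replace(",", "").replace("$", ""))

quotes = [
    {"symbol": "AAA", "price": "$1,234.50", "prev_close": "$1,200.00"},
    {"symbol": "BBB", "price": "$87.10", "prev_close": "$90.25"},
]
for q in quotes:
    change = parse_price(q["price"]) - parse_price(q["prev_close"])
    q["trend"] = "rising" if change > 0 else "falling"
print([(q["symbol"], q["trend"]) for q in quotes])
```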
---
Post 131fe9c: "Unlocking the Potential of Data Scraping: A Profitable Guide to Investing in Stocks" (u/Octoparseideas in r/Octoparse_ideas, 4/28/2023 2:43:31 AM)
https://www.reddit.com/r/Octoparse_ideas/comments/131fe9c/unlocking_the_potential_of_data_scraping_a/
---
Posted 9/13/2021 8:57:10 AM
Travel rules are changing along with the Covid case curve. With the Delta variant, cases are rising; as I compile this article, the EU is considering reimposing travel restrictions on U.S. visitors.
Anyway, I have built my Tripadvisor scraper with Octoparse and crawled the details of destinations that are open to U.S. citizens. It always pays to be prepared for a refreshing trip.
Note: If you are setting out to [these countries](https://edition.cnn.com/travel/article/us-international-travel-covid-19/index.html), you may want to check whether vaccination or quarantine is required.
By the way, [web scraping](https://en.wikipedia.org/wiki/Web_scraping) is definitely the best way to pull down web data so we can sift through it and get the most value out of it. I will show how it helps me get the travel data.
[Geo Map generated by mapchart.net](https://preview.redd.it/wncnyazsi8n71.png?width=698&format=png&auto=webp&v=enabled&s=5f43df49ffb9098feb9a5337f2cdb702e683c057)
# Web Scraping Travel Data
Do you have any idea about [big data in tourism](https://www.octoparse.com/blog/big-data-in-tourism)?
Businesses in the travel industry track all kinds of data: business data from travel agents, for example, and visitors’ behavioral data on travel-related platforms. They may know your traveling habits better than you do. The whole industry leverages big data to launch the right products and find the right people to pay for their services.
Web scraping is the tech that makes this possible.
As a traveler, I want scraped travel data to serve my own needs: finding the most attractive destinations and collecting Tripadvisor guides for reference.
**What I am going to do**
* First of all, I need a list of countries to look into.
* Secondly, I will use a web scraping tool, Octoparse, to build a Tripadvisor scraper and crawl these countries’ travel data.
* Finally, I am going to pack my baggage and head for the destination that fits my travel taste most!
# Where Can an American Go
So, where can an American go for travel now?
[This article by CNN](https://edition.cnn.com/travel/article/us-international-travel-covid-19/index.html) lists the destinations that are open to U.S. travelers (the list may be updated from time to time).
What I want to do is pull all the country names on this web page into a spreadsheet so I can paste them into Octoparse to get more specific data from Tripadvisor.
[Octoparse: How to get list information on a web page into excel](https://preview.redd.it/47rtyaowi8n71.png?width=699&format=png&auto=webp&v=enabled&s=2f55d1b98be04eb9b9e71e295bd7fb3a44acbb88)
Octoparse can easily get list information on a web page into Excel or CSV.
This is extremely helpful when you want a list of URLs or a list of data to paste and search on another platform, or to import into data analytics software for analysis.
Now that I have the text list of destinations, I am going to build a Tripadvisor scraper to get specific data about these places.
# Build a TripAdvisor Scraper
The data I am going to crawl from Tripadvisor:
* I want to check the travel popularity of these countries. I will consult the number of reviews about each country on Tripadvisor. (My hypothesis: more visits, more reviews.)
* I have my travel theme. I am a nature lover interested in outdoor events and nature sightseeing. I will get the tag information of these destinations so that I can filter through and niche down to the perfect place where I can chase the wind, play on the beach or appreciate the grandeur of a peak.
* I will save the URL of travel guides on Tripadvisor for further travel planning. (Thanks contributors!)
## Batch Generate URLs with Country Names
Where to get this data? This is a sample page: [Tripadvisor Nepal](https://www.tripadvisor.com/Search?q=Nepal&searchSessionId=628D87C594BA0F3C2D5F64F9187E6C0E1630569008168ssid&sid=CE17A104D3744921A306A608605241AB1630574430004&blockRedirect=true&ssrc=a&geo=1&rf=2).
With the list of country names I have scraped in the previous step, I can batch generate all Tripadvisor country pages with Octoparse.
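The batch-generation step is simple string templating; a Python sketch using a simplified form of the search URL (the real Tripadvisor URL carries extra session parameters):

```python
from urllib.parse import quote_plus

countries = ["Nepal", "Costa Rica", "Croatia"]  # sample names from the scraped list

# One search URL per country, based on a simplified form of the sample page's URL.
urls = [f"https://www.tripadvisor.com/Search?q={quote_plus(c)}" for c in countries]
for u in urls:
    print(u)
```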
[Keep reading: how to create your own TripAdvisor scraper](https://www.octoparse.com/blog/tripadvisor-scraper-top-destinations-open-to-the-us-citizens-under-covid/?re=)
---
Post pnbokm: "Tripadvisor Scraper: Top Destinations Open to the U.S. Citizens under Covid" (u/Octoparseideas in r/u_Octoparseideas, 9/13/2021 8:57:10 AM)
https://www.reddit.com/r/u_Octoparseideas/comments/pnbokm/tripadvisor_scraper_top_destinations_open_to_the/
---
Posted 5/18/2022 3:36:24 AM
How much do you know about web scraping? Don’t worry if you are new to the concept. In this article, we will brief you on the basics of web scraping, teach you how to assess web scraping tools so you can choose the one that best fits your needs, and, last but not least, present a list of web scraping tools for your reference.
[https://www.octoparse.com/blog/9-free-web-scrapers-that-you-cannot-miss/?utm\_source=sale2022&utm\_medium=10freewebscrapers&utm\_campaign=reddit](https://www.octoparse.com/blog/9-free-web-scrapers-that-you-cannot-miss/?utm_source=sale2022&utm_medium=10freewebscrapers&utm_campaign=reddit)
---
Post us3vu2: "10 FREE Web Scrapers That You Cannot Miss in 2022" (u/Octoparseideas in r/u_Octoparseideas, 5/18/2022 3:36:24 AM)
https://www.reddit.com/r/u_Octoparseideas/comments/us3vu2/10_free_web_scrapers_that_you_cannot_miss_in_2022/
---
Posted 6/24/2021 2:41:09 AM
[removed]
---
Post o6rv4p: "How to Extract and Monitor Stock Prices from Yahoo! Finance" (u/Octoparseideas in r/u_Octoparseideas, 6/24/2021 2:41:09 AM)
https://www.reddit.com/r/u_Octoparseideas/comments/o6rv4p/how_to_extract_and_monitor_stock_prices_from/
---
Posted 2/20/2023 9:51:24 AM
ChatGPT reached 1 million users within a week of launch and more than 57 million monthly users in its first month. This impressive performance has the public considering how it will change our lives and wondering whether it may replace jobs anytime soon. Check out this article to learn more about how ChatGPT will affect web scraping tools.
[https://www.octoparse.com/blog/chatgpt-and-scraping-tools](https://www.octoparse.com/blog/chatgpt-and-scraping-tools)
---
Post 1172t0k: "Will ChatGPT Replace Web Scraping Tools? Here is Our Answer" (u/Octoparseideas in r/Octoparse_ideas, 2/20/2023 9:51:24 AM)
https://www.reddit.com/r/Octoparse_ideas/comments/1172t0k/will_chatgpt_replace_web_scraping_tools_here_is/
---
Posted 9/8/2021 1:33:26 AM
A place for members of r/Octoparse_ideas to chat with each other
---
Post pk0ql2: "r/Octoparse_ideas Lounge" (u/Octoparseideas in r/Octoparse_ideas, 9/8/2021 1:33:26 AM)
https://www.reddit.com/r/Octoparse_ideas/comments/pk0ql2/roctoparse_ideas_lounge/
---
Posted 6/8/2021 11:25:44 AM
[removed]
---
Post nv1snc: "Build a Reddit Image Scraper without Coding" (u/Octoparseideas in r/u_Octoparseideas, 6/8/2021 11:25:44 AM)
https://www.reddit.com/r/u_Octoparseideas/comments/nv1snc/build_a_reddit_image_scraper_without_coding/
---
Posted 1/26/2022 9:55:12 AM
*Originally published as* [https://www.octoparse.com/blog/what-is-web-scraping-basics-and-use-cases/?re=](https://www.octoparse.com/blog/what-is-web-scraping-basics-and-use-cases/?re=) *on January 24, 2022.*
A basic intro to lead you into the world of web scraping. What is web scraping? How does it work, and how is it used? What are the pros and cons? All the questions that concern you will be answered here.
# What is web scraping?
Web scraping is a way to download data from web pages.
You may have heard some of its nicknames, like data scraping, data extraction, or web crawling (web crawling can be narrower, referring to data collection done by search engine bots). In most cases, they mean the same thing: a programmatic way to pull data from the web.
Web scraping helps fetch data (like emails, phone numbers, articles, etc.) from web pages and organize it into certain formats like Excel, CSV or HTML, etc.
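The "organize it into a format" half is the easy part; for example, hypothetical scraped contact records can be written to CSV with Python’s standard library (names and emails here are made up for illustration):

```python
import csv

# Hypothetical scraped records to be organized into a CSV file.
rows = [
    {"name": "Acme Inc.", "email": "hello@acme.test", "phone": "555-0100"},
    {"name": "Globex", "email": "info@globex.test", "phone": "555-0199"},
]

with open("leads.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "email", "phone"])
    writer.writeheader()   # column headers as the first row
    writer.writerows(rows)
```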
See how [Wikipedia explains web scraping](https://en.wikipedia.org/wiki/Web_scraping):
>*“The content of a page may be parsed, searched, reformatted, its data copied into a spreadsheet or loaded into a database. Web scrapers typically take something out of a page, to make use of it for another purpose somewhere else. An example would be to find and copy names and telephone numbers, or companies and their URLs, or e-mail addresses to a list (contact scraping).”*
In essence, web scraping is a dedicated data collector who captures the exact set of data you want from a load of web pages and makes it into a neat file for your download and further use.
## What’s the point of web scraping?
Big Data and Automation are no longer new concepts in the current business world. They are widely used techniques to improve people’s effectiveness and efficiency.
Big data is big in volume. Automation is about getting things done on autopilot. Web scraping is good at both: getting voluminous data fast with little human labor required.
In the context of big data collection, web scraping comes to the rescue. If you want to train a machine learning model, a large amount of accurate input data will make you smile. This data teaches your model important lessons and gets you a more intelligent algorithm.
That’s when web scraping plays its ace: grabbing data efficiently from many websites and getting it into a machine-readable format for quick use.
Well, not everyone has an AI model to train, but most of us need to collect data for different purposes. Web scraping’s automated nature greatly improves working efficiency and eliminates human error. Sit back and let the robot do the repetitive work.
When you get to the [use cases](https://www.octoparse.com/blog/what-is-web-scraping-basics-and-use-cases#web_scraping2), you will find out how web scraping helps in real cases.
## How does web scraping work?
A web page’s data is written in its HTML file. Browsers like Chrome and Firefox are tools that render that HTML file for us.
Therefore, no matter how diverse web pages look, every string of data we see in the browser is already written in the HTML source code. Whatever you see can be traced and located in the code (by XPath, a language used to locate an element).
Web scraping finds the right data according to where it is located and takes a series of actions (such as extracting selected text, extracting hyperlinks, entering preset data, and clicking certain buttons), just like a human, except that it surfs the Internet and copies data quickly around the clock without fatigue.
Once the data is ready, you can download it from the cloud or save it to a local file for further use.
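To make the "locate it in the code" idea concrete, here is a minimal Python sketch using the standard library’s limited XPath support on a tiny made-up snippet (real scrapers run full XPath over arbitrary HTML; this snippet is well-formed so ElementTree can parse it):

```python
import xml.etree.ElementTree as ET

# A made-up, well-formed page fragment: two business "cards".
html = """
<html><body>
  <div class="card"><span class="name">Acme Inc.</span><span class="phone">555-0100</span></div>
  <div class="card"><span class="name">Globex</span><span class="phone">555-0199</span></div>
</body></html>
"""

root = ET.fromstring(html)
# Path expressions select every element matching the class attribute.
names = [el.text for el in root.findall(".//span[@class='name']")]
phones = [el.text for el in root.findall(".//span[@class='phone']")]
print(list(zip(names, phones)))
```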
# How is web scraping used?
Who is doing web scraping and how do they get empowered by web scraping? Here are some use cases. You may discover how web scraping could benefit you as well.
## Who is using web scraping?
Web scraping is widely used in industries like:
* jobs & recruitment
* consultancy
* hotel & travel
* eCommerce & retailing
* finance and more
* marketing
***Tips:*** *Check how to* [*get started*](https://www.octoparse.com/blog/what-is-web-scraping-basics-and-use-cases#web_scraping4)*, an example of how I started to build my web scraper and got data from YouTube that helped my KOL marketing.*
They are getting data mostly for price/brand monitoring, price comparison, and big data analysis that serve their decision-making process and business strategy.
For individuals, web scraping helps professionals like:
* data scientists
* data journalists
* marketers
* academic researchers
* business analysts
* eCommerce sellers and more
(to obtain data that supports their sales, marketing, research, and analysis.)
Does web scraping sound like a big undertaking to you? Believe me, it is not. It can be used in many small ways and free you from tedious, repetitive work. Basically, if you need data that can be found on websites and you don’t want to copy and paste it manually, you use web scraping.
**Read also:**
* [How Dealogic Gets Empowered with Content Aggregation](https://service.octoparse.com/dealogic-web-scraping-for-content-aggregation)
* [Ecommerce Product Tracking for Successful Reselling](https://service.octoparse.com/amazon-product-monitoring)
* [Web Scraping In Marketing Consultancy](https://service.octoparse.com/pricetrack-consultancy-web-scraping)
* [Web Scraping Manages Inventory Tracking in Retail Industry](https://service.octoparse.com/inventory-web-scraping-blind-rivet-supply)
## What are the most scraped data/websites?
According to [the Most Scraped Websites](https://www.octoparse.com/blog/top-10-most-scraped-websites) by Octoparse, eCommerce marketplaces, directory websites and social media platforms are the most scraped websites in general.
**Websites like Amazon, eBay, Walmart, Yelp, Yellowpages, Craigslist, and social media platforms like Facebook, Twitter and LinkedIn are among the most popular.**
What data are people getting from these sites? Well, everything that serves their research or sales.
* Online product details like stock, prices, reviews and specifications;
* Business/lead information like store or individual names, emails, addresses, phone numbers, and other details that support outbound outreach;
* Discussions on social media or comments on review pages that offer data sources for NLP or sentiment analysis.
The need to migrate data is also one of the reasons people choose web scraping. A scraper then works like one grand Ctrl+C action, copying data from one place to another for the user.
You may be interested in [web scraping business ideas](https://www.octoparse.com/blog/10-web-scraping-business-ideas-for-everyone) to discover more detailed information about how web scraping is used in practical scenarios.
# The Pros and Cons of Web Scraping
Because of its accuracy and efficiency, web scraping empowers individuals and businesses in many ways. However, worries always exist: will it be too complicated to handle? Is it hard to fix and maintain? Fair questions. But if you get the chance to dive into it, you will see that the advantages of web scraping very likely outweigh the tricky parts.
## The advantages of web scraping
**#High speed**
Getting data faster. This is self-evident and may be the core reason people resort to web scraping. Compared with doing it manually, a web scraper executes your commands automatically, according to the workflow you have built for it. Every step that would have taken up your time is done by the scraper.
Once you set it up, it will run for you relentlessly, getting all kinds of web data fast from different websites. If you want to see how fast a scraper can be, I recommend trying our [scraper templates](https://helpcenter.octoparse.com/hc/en-us/articles/900003158843). Try an Amazon scraper to gather product details or product reviews and see how a scraper can get you hundreds of well-structured rows of data in just a minute.
[Download Octoparse](https://www.octoparse.com/download) to witness the speed of web scraping.
BTW, web scraping is a valu...
---
Post sd2f3c: "What Is Web Scraping — Basics & Practical Uses" (u/Octoparseideas in r/Octoparse_ideas, 1/26/2022 9:55:12 AM)
https://www.reddit.com/r/Octoparse_ideas/comments/sd2f3c/what_is_web_scraping_basics_practical_uses/
---
Posted 6/23/2022 9:14:49 AM
A social media scraper often refers to an automatic web scraping tool that extracts data from social media channels. In this article, I am going to further illustrate how social media datasets can be used in business and list out the top 5 social media scraping tools I recommend.
[https://www.octoparse.com/blog/top-5-social-media-scraping-tools-for-2021/?utm\_source=sale2022&utm\_medium=socialmediascrapingtools&utm\_campaign=reddit](https://www.octoparse.com/blog/top-5-social-media-scraping-tools-for-2021/?utm_source=sale2022&utm_medium=socialmediascrapingtools&utm_campaign=reddit)
---
Post visouv: "Top 5 Social Media Scraping Tools for 2022" (u/Octoparseideas in r/u_Octoparseideas, 6/23/2022 9:14:49 AM)
https://www.reddit.com/r/u_Octoparseideas/comments/visouv/top_5_social_media_scraping_tools_for_2022/
---
Posted 6/21/2022 2:41:32 AM
Google Sheets can be regarded as a basic web scraper. You can use a special formula to extract data from websites, import the data directly into Google Sheets, and share it with your friends. In the following parts, you can learn easy methods for building a simple web scraper with Google Sheets.
[https://www.octoparse.com/blog/simple-web-scraping-using-google-sheets/?utm\_source=sale2022&utm\_medium=webscrapinggooglesheets&utm\_campaign=reddit](https://www.octoparse.com/blog/simple-web-scraping-using-google-sheets/?utm_source=sale2022&utm_medium=webscrapinggooglesheets&utm_campaign=reddit)
---
Post vh2r7y: "Simple Web Scraping using Google Sheets (2022 updated)" (u/Octoparseideas in r/u_Octoparseideas, 6/21/2022 2:41:32 AM)
https://www.reddit.com/r/u_Octoparseideas/comments/vh2r7y/simple_web_scraping_using_google_sheets_2022/
---
Posted 10/29/2021 8:35:41 AM
https://www.octoparse.com/blog/introducing-the-new-octoparse-84/?re=
---
Post qi882s: "Introducing the new Octoparse 8.4" (u/Octoparseideas in r/Octoparse_ideas, 10/29/2021 8:35:41 AM)
https://www.reddit.com/r/Octoparse_ideas/comments/qi882s/introducing_the_new_octoparse_84/
---
Posted 4/1/2022 1:56:13 AM
In recent years, big data has become the new gold, leading trends in data collection and analysis. Web scraping, or web data extraction, has become a popular way to collect web data. Well recognized for its flexibility and adaptability, this technology has helped many individuals and businesses retrieve loads of data from nearly any website or database.
Website owners, on the other hand, do not welcome web scraping as much. It can add heavy traffic loads to a site’s servers, which in the worst scenarios can potentially crash the site. As a result, as new web scraping technologies are developed, the means of defense against them have become more sophisticated as well.
The most common way to fight web scraping is to limit the access rate of any single IP. A web scraper that makes too many requests in a short period from a single IP address can be easily detected and will sooner or later get blocked by the target website. To reduce the chances of getting blocked, we should avoid scraping a website with a single IP address. The easiest way is to use proxy servers. In this article, we will introduce what a proxy server is and some popular web scrapers that have IP proxy features.
https://preview.redd.it/sqkwutfdqtq81.png?width=764&format=png&auto=webp&v=enabled&s=07e2a0c22d2709012d20ba321558f7b9105001c4
# What is a proxy server
The word proxy means “to act on behalf of another,” and a proxy server acts on behalf of the user. When we browse a web page, a proxy is a system that provides a gateway between end-users and the web pages we visit online. Therefore, it helps prevent cyber attackers from entering a private network.
When a computer connects to the internet, it uses an IP address. This is similar to your home’s street address, telling incoming data where to go and marking outgoing data with a return address for other devices to authenticate. A proxy server is essentially a computer on the internet that has an IP address of its own. All requests to the Internet go to the proxy server first, which evaluates the request and forwards it to the Internet. Likewise, responses come back to the proxy server and then to the user. Therefore, proxy servers provide varying levels of functionality, security, and privacy depending on your use case, needs, or company policy.
# How does a proxy server work for web scraping
As we mentioned above, websites usually block the IP addresses you use to access them. So using a proxy server is a good solution as the server has its own IP address and can protect yours. When using a proxy, the website you are making the request to no longer sees your IP address but the IP address of the proxy, giving you the ability to scrape the web anonymously.
Using a proxy pool allows you to scrape a website much more reliably and significantly reduces the chance that your crawlers get banned. You build a proxy pool containing different proxy IP addresses to rotate through, integrate it with your web scraping tool or script, and you can collect web data while staying protected from blocking problems.
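A minimal sketch of such a rotating pool in Python (the proxy addresses are documentation placeholders; a real pool would add health checks, authentication, and retries):

```python
import itertools
import urllib.request

# A hypothetical pool of proxy addresses to rotate through (placeholder IPs).
PROXY_POOL = ["203.0.113.10:8080", "203.0.113.11:8080", "203.0.113.12:8080"]
rotation = itertools.cycle(PROXY_POOL)

def opener_for_next_proxy():
    """Build a urllib opener that routes the next request through the next proxy."""
    proxy = next(rotation)
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler), proxy

# opener, proxy = opener_for_next_proxy()
# opener.open("https://example.com")  # the request appears to come from `proxy`
```

Each call hands back an opener bound to the next address in the pool, so consecutive requests are spread across IPs instead of hammering the target from one.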
# Web scraping tools with proxy features
IP proxies work quite effectively for bypassing website blocks, and an easy way to use them is to opt for web scraping tools that already offer such proxy features, like [Octoparse](https://www.octoparse.com/). These tools can be deployed with the IP proxies at your disposal or with the IP proxy resources built into the specific tools.
It is always recommended to use a web scraping tool that runs with IP proxies when you need to scrape websites that employ anti-scraping measures. Some popular scraper tools out there include [Octoparse](https://www.octoparse.com/), Mozenda, Parsehub, and Screen Scraper.
## Octoparse
[**Octoparse**](https://www.octoparse.com/) is a powerful and free web scraping tool that can scrape almost all websites. Its cloud-based data extraction runs with a large pool of Cloud IP addresses which minimizes the chances of getting blocked and protects your local IP addresses. The newly released version, Octoparse 8.5, has [multiple country-based IP pools](https://helpcenter.octoparse.com/hc/en-us/articles/4417928258201-Switch-Cloud-IP-for-a-task-Version-8-5-) to choose from so you can effectively scrape websites that are only accessible to IPs of a specific region/country. With Octoparse, even when you run the crawler on your local device, you can still use a list of custom proxies to run the crawler to avoid revealing your real IP. (Here is a tutorial that introduces how to [set up proxies](https://helpcenter.octoparse.com/hc/en-us/articles/4408402059289-Set-up-IP-proxy) in Octoparse.)
## Mozenda
**Mozenda** is also an easy-to-use desktop data scraper. It offers geolocation proxies and custom proxies for users to choose from. Geolocation proxies allow you to route your crawler’s traffic through another part of the world so you can access region-specific information. When standard geolocation doesn’t meet your project requirements, you can connect to proxies from a third-party provider via custom proxies.
## Parsehub
**Parsehub** is an easy-to-learn, visual tool for gathering data from the web which also allows cloud scraping and IP rotation. After you enable IP rotation for your projects, proxies used to run your project come from many different countries. Additionally, you have the option to add your own list of custom proxies to ParseHub as part of the IP rotation feature if you would like to access a website from a particular country or if you would prefer to use your own proxies instead of the ones it provides for IP rotation.
## Apify
**Apify** is a web scraping and automation platform to collect data. It not only offers data collection service but also a proxy service reducing the blocking of your web scraping. Apify Proxy provides access to both residential and datacenter IP addresses. Datacenter IPs are fast and cheap but might be blocked by target websites. Residential IPs are more expensive and harder to block.
Now you should have a basic understanding of what a proxy server is and how it can be used for web scraping. Even though proxy makes web scraping more efficient, keeping the scraping speed under control and avoiding overloading your target websites is also important. Living in peace with websites and not breaking the balance will help you get the data continuously.
*Originally published as* [*https://www.octoparse.com/blog/proxy-server-for-web-scraping/?re=*](https://www.octoparse.com/blog/proxy-server-for-web-scraping/?re=) *on March 30, 2022.*
ttfhu1
Octoparse_ideas
Octoparseideas
t3_ttfhu1
https://www.reddit.com/r/Octoparse_ideas/comments/ttfhu1/use_proxy_server_for_web_scraping/
4/1/2022 1:56:13 AM
1/1/0001 12:00:00 AM
False
False
1
0
Silver:0 Gold:0 Platinum:0 Count:0
False
False
Use Proxy Server for Web Scraping
False
1
ttfhu1
0
32057
3
3
32
2.84951024042743
11
0.979519145146928
0
0
584
52.0035618878005
1123
Red
10
Dash Dot Dot
20
No
922
Posted
11/16/2021 3:52:22 AM
Know everything you need as Octoparse beginners:
[https://helpcenter.octoparse.com/hc/en-us/articles/4409179533465-Join-our-Beginner-Academy-Learn-web-scraping-in-the-community/?re=](https://helpcenter.octoparse.com/hc/en-us/articles/4409179533465-Join-our-Beginner-Academy-Learn-web-scraping-in-the-community/?re=)
If you are new to web scraping with Octoparse and have found it a bit tricky to learn alone, we are holding the Beginner Academy as the finale of 2021.
In the community, you will have access to onboarding courses, exercises, and a community group where you can discuss with fellow learners and submit questions to our professional support.
Join our Beginner Academy: Learn web scraping in the community (u/Octoparseideas, 11/16/2021)
https://www.reddit.com/r/u_Octoparseideas/comments/quyqm7/join_our_beginner_academy_learn_web_scraping_in/

Posted 6/14/2022:
Take an EXTRA 10% off everything on Jun.15th only!
【Standard Year】Save $271 + FREE crawler + 1-on-1 training
【Professional Year】Save $800 + FREE crawler\*3 + 1-on-1 training\*3
👉 Click to check out the deals: [https://www.octoparse.com/summer-sale-2022/?utm\_source=reddityure&utm\_medium=startingin1day&utm\_campaign=22summersale](https://www.octoparse.com/summer-sale-2022/?utm_source=reddityure&utm_medium=startingin1day&utm_campaign=22summersale)
💥 Octoparse Summer Sale Starts in 1 Day (u/Octoparseideas, 6/14/2022)
https://www.reddit.com/r/u_Octoparseideas/comments/vbxcsn/octoparse_summer_sale_starts_in_1_day/

Posted 12/28/2021:
https://youtu.be/OneU-njIsXE
How to scrape app reviews from Google Play with Octoparse (r/Octoparse_ideas, 12/28/2021)
https://www.reddit.com/r/Octoparse_ideas/comments/rq64wm/how_to_scrape_app_reviews_from_google_play_with/

Posted 8/29/2022:
If you have no technical knowledge of web scraping and need a capable, flexible tool, give this article a read: we are going to discuss a free Quora scraper. With this free tool, you will be able to export Quora data as JSON or CSV files.
[https://www.octoparse.com/blog/web-scraping-quora/?utm\_source=2022q3&utm\_medium=web-scraping-quora&utm\_campaign=reddit](https://www.octoparse.com/blog/web-scraping-quora/?utm_source=2022q3&utm_medium=web-scraping-quora&utm_campaign=reddit)
How to Scrape Questions and Answers Data from Quora (u/Octoparseideas, 8/29/2022)
https://www.reddit.com/r/u_Octoparseideas/comments/x0ivn8/how_to_scrape_questions_and_answers_data_from/

Posted 9/7/2022:
Zillow is one of the most popular websites for searching homes, checking home values, and finding real estate agents. It also holds a lot of data about local homes, their prices, and the realtors. That’s why scraped Zillow data is valuable for tools and third-party applications serving commercial real estate needs. The data we scrape from Zillow will contain information about the houses listed for sale in any city in its database.
[https://www.octoparse.com/blog/how-to-scrape-zillow-data/?utm\_source=2022q3&utm\_medium=how-to-scrape-zillow-data&utm\_campaign=reddit](https://www.octoparse.com/blog/how-to-scrape-zillow-data/?utm_source=2022q3&utm_medium=how-to-scrape-zillow-data&utm_campaign=reddit)
Zillow Scraper: Scrape Zillow Real Estate Data for Free (r/Octoparse_ideas, 9/7/2022)
https://www.reddit.com/r/Octoparse_ideas/comments/x7ton3/zillow_scraper_scrape_zillow_real_estate_data_for/

Posted 11/11/2021:
11/11/2021 3:01:01 AM
Octoparse salutes Black Friday: [https://youtu.be/En7MS6lo8WQ](https://youtu.be/En7MS6lo8WQ)
This Black Friday with Octoparse: lower prices and a new version!
Save 30-40% from 11.17 to 12.03, 2021 (23:59 EST)
Extra 10-15% off on the first day, 11.17 (EST) only
And get free giveaways: a crawler plus training
New in version 8.4:
Cool new features: a custom user agent, page scroll-down, and Zapier integration
A faster engine, a more intuitive layout, and robust exporting
Tune in to Octoparse this coming Black Friday and save big!
Octoparse salutes Black Friday (r/Octoparse_ideas, 11/11/2021)
https://www.reddit.com/r/Octoparse_ideas/comments/qrb8lc/octoparse_salutes_black_friday/

Posted 8/25/2022:
8/25/2022 8:45:17 AM
LinkedIn job postings are a treasure trove of information and are among the most effective ways to build your network and attract new connections. The big social media site has tens of millions of potential candidates, and it's also one of the best places to find open job positions. But finding these postings can really be time-consuming if you're doing it manually. This article shows you how to scrape LinkedIn job postings, including a list of all current job posts as well as a way to search for specific jobs.
[https://www.octoparse.com/blog/linkedin-job-scraper/?utm\_source=2022q3&utm\_medium=linkedin-job-scraper&utm\_campaign=reddit](https://www.octoparse.com/blog/linkedin-job-scraper/?utm_source=2022q3&utm_medium=linkedin-job-scraper&utm_campaign=reddit)
Best LinkedIn Job Scraper to Extract Job Postings from LinkedIn (r/Octoparse_ideas, 8/25/2022)
https://www.reddit.com/r/Octoparse_ideas/comments/wx8ujr/best_linkedin_job_scraper_to_extract_job_postings/

Posted 6/15/2022:
6/15/2022 4:07:54 AM
Today Only: EXTRA 10% off Everything and more!
【Standard Year】Save $271 + FREE crawler + 1-on-1 training
【Professional Year】Save $800 + FREE crawler\*3 + 1-on-1 training\*3
👉 Click to check out the deals: [https://www.octoparse.com/summer-sale-2022/?utm\_source=redditfirst&utm\_medium=firstday0615&utm\_campaign=22summersale](https://www.octoparse.com/summer-sale-2022/?utm_source=redditfirst&utm_medium=firstday0615&utm_campaign=22summersale)
💥 Summer Sale 2022 Starts Now (u/Octoparseideas, 6/15/2022)
https://www.reddit.com/r/u_Octoparseideas/comments/vclsgr/summer_sale_2022_starts_now/

Posted 9/1/2022:
9/1/2022 3:45:45 AM
In this article we will talk about:
* The data you can extract from Upwork
* The legality of scraping
* The benefits of scraping to your business
* A step-by-step guide on using a web scraper (Octoparse) to scrape Upwork
[https://www.octoparse.com/blog/upwork-web-scraping/?utm\_source=2022q3&utm\_medium=upwork-web-scraping&utm\_campaign=reddit](https://www.octoparse.com/blog/upwork-web-scraping/?utm_source=2022q3&utm_medium=upwork-web-scraping&utm_campaign=reddit)
How to Scrape Upwork for Talents and Jobs (u/Octoparseideas, 9/1/2022)
https://www.reddit.com/r/u_Octoparseideas/comments/x2xec3/how_to_scrape_upwork_for_talents_and_jobs/
Also posted to r/Octoparse_ideas: https://www.reddit.com/r/Octoparse_ideas/comments/x2xey3/how_to_scrape_upwork_for_talents_and_jobs/

Posted 4/1/2022:
*Originally published as* [*https://www.octoparse.com/blog/proxy-server-for-web-scraping/?re=*](https://www.octoparse.com/blog/proxy-server-for-web-scraping/?re=) *on March 30, 2022.*
In recent years, big data has become the new gold, driving the trends of data collection and data analysis. Web scraping, or web data extraction, has become a popular way of collecting web data. Well recognized for its flexibility and adaptability, this technology has helped many individuals and businesses retrieve large amounts of data from nearly any website or database.
On the other hand, web scraping is not as welcome to website owners. It can add heavy traffic loads to a website’s servers, which in the worst cases can crash the site. As a result, as new technologies have been developed for web scraping, the means of defense against it have become more sophisticated as well.
The most common way to fight back against web scraping is to limit the access rate of any single IP. A web scraper that makes too many requests in a short period of time from a single IP address is easily detected and will sooner or later get blocked by the target website. To reduce the chances of getting blocked, we should avoid scraping a website with a single IP address. The easiest way to do that is to use proxy servers. In this article, we will introduce what a proxy server is and some popular web scrapers that offer IP proxy features.
# What is a proxy server
The word proxy means “to act on behalf of another,” and a proxy server acts on behalf of the user. When we browse a web page, a proxy is a system that provides a gateway between end-users and the web pages we visit online. Therefore, it helps prevent cyber attackers from entering a private network.
When a computer connects to the internet, it uses an IP address. This is similar to your home’s street address, telling incoming data where to go and marking outgoing data with a return address for other devices to authenticate. A proxy server is essentially a computer on the internet that has an IP address of its own. All requests to the Internet go to the proxy server first, which evaluates the request and forwards it to the Internet. Likewise, responses come back to the proxy server and then to the user. Therefore, proxy servers provide varying levels of functionality, security, and privacy depending on your use case, needs, or company policy.
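A minimal sketch of that request flow using Python's standard library: `ProxyHandler` tells `urllib` to hand every HTTP(S) request to the proxy's address instead of connecting to the target directly. The proxy address below is a placeholder, not a real server, so the actual network call is left commented out.

```python
import urllib.request

# Placeholder proxy address -- substitute a proxy server you actually control.
PROXY = "203.0.113.10:8080"

# Route all http/https requests through the proxy instead of connecting directly.
proxy_handler = urllib.request.ProxyHandler({
    "http": f"http://{PROXY}",
    "https": f"http://{PROXY}",
})
opener = urllib.request.build_opener(proxy_handler)

# From here on, opener.open(url) sends each request to the proxy, which
# forwards it to the target site and relays the response back to us.
# opener.open("https://example.com", timeout=10)
print(sorted(proxy_handler.proxies))  # -> ['http', 'https']
```

The target website sees the proxy's IP address as the origin of the request, which is exactly the anonymity property described above.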
# How does a proxy server work for web scraping
As we mentioned above, websites usually block the IP addresses you use to access them. So using a proxy server is a good solution as the server has its own IP address and can protect yours. When using a proxy, the website you are making the request to no longer sees your IP address but the IP address of the proxy, giving you the ability to scrape the web anonymously.
Using a proxy pool allows you to scrape a website much more reliably and significantly reduces the chances that your crawlers get banned. Build a pool of different proxy IP addresses to rotate through, integrate it with your web scraping tool or script, and you can collect web data with far less risk of being blocked.
# Web scraping tools with proxy features
IP proxies work quite effectively for bypassing website blocks, and an easy way to use them is to opt for web scraping tools that already offer proxy features, like [Octoparse](https://www.octoparse.com/). These tools can be deployed with IP proxies at your disposal or with the IP proxy resources built into the specific tool.
It is always recommended to use a web scraping tool that runs with IP proxies when you need to scrape websites that employ anti-scraping measures. Some popular scraper tools out there include [Octoparse](https://www.octoparse.com/), Mozenda, Parsehub, and Apify.
## Octoparse
[**Octoparse**](https://www.octoparse.com/) is a powerful and free web scraping tool that can scrape almost all websites. Its cloud-based data extraction runs with a large pool of Cloud IP addresses which minimizes the chances of getting blocked and protects your local IP addresses. The newly released version, Octoparse 8.5, has [multiple country-based IP pools](https://helpcenter.octoparse.com/hc/en-us/articles/4417928258201-Switch-Cloud-IP-for-a-task-Version-8-5-) to choose from so you can effectively scrape websites that are only accessible to IPs of a specific region/country. With Octoparse, even when you run the crawler on your local device, you can still use a list of custom proxies to run the crawler to avoid revealing your real IP. (Here is a tutorial that introduces how to [set up proxies](https://helpcenter.octoparse.com/hc/en-us/articles/4408402059289-Set-up-IP-proxy) in Octoparse.)
## Mozenda
**Mozenda** is also an easy-to-use desktop data scraper. It offers geolocation proxies and custom proxies for users to choose from. Geolocation proxies allow you to route your crawler’s traffic through another part of the world so you can access region-specific information. When standard geolocation doesn’t meet your project requirements, you can connect to proxies from a third-party provider via custom proxies.
## Parsehub
**Parsehub** is an easy-to-learn, visual tool for gathering data from the web that also supports cloud scraping and IP rotation. Once you enable IP rotation for a project, the proxies used to run it come from many different countries. Additionally, you can add your own list of custom proxies to ParseHub as part of the IP rotation feature, whether you want to access a website from a particular country or simply prefer your own proxies over the ones ParseHub provides.
## Apify
**Apify** is a web scraping and automation platform for collecting data. It offers not only data collection services but also a proxy service that reduces blocking during web scraping. Apify Proxy provides access to both residential and datacenter IP addresses. Datacenter IPs are fast and cheap but might be blocked by target websites; residential IPs are more expensive and harder to block.
Now you should have a basic understanding of what a proxy server is and how it can be used for web scraping. Even though proxies make web scraping more efficient, it is also important to keep your scraping speed under control and avoid overloading your target websites. Living in peace with websites, rather than upsetting the balance, will help you keep collecting data continuously.
Use Proxy Server for Web Scraping (u/Octoparseideas, 4/1/2022)
https://www.reddit.com/r/u_Octoparseideas/comments/ttfgq0/use_proxy_server_for_web_scraping/

Posted 9/17/2021:
Businesses across the globe are on a constant hunt for talent. Often, this search for skilled candidates is full of unwanted bumps and lumps. To eliminate this friction, recruitment executives turn to job and career portals. As per [this](https://www.jobboarddoctor.com/wp-content/uploads/2016/05/2016-Job-Board-Survey-Final-Report.pdf) report, ***20% of all hirings happen through job portals***. Also, 20% of job portals are new entrants, i.e., less than 2 years old.
With the rise of SaaS, PaaS, and other ready-to-market pre-packaged software tools, the entry barriers to this section of the recruitment industry have significantly reduced. But there has also been a spike in the number of job boards that fail within 3 years.
So, **how to build a successful job board website?** In this article, we explain how you can create a successful, profitable and sustainable ***“niche job board website”***.
# What Are Niche Job Boards?
Niche job boards are industry and/or location-specific job aggregator websites. They target a specific domain or industry vertical. In short, it’s a **segmented matchmaking platform for jobs & jobseekers, employees & employers, talent & opportunities from a particular industry or geography**. As per research, more than 40% of job boards are functioning with only 1–5 employees. It’s easy and affordable to start a niche job board.
# What Are The Secrets Of Building Successful Niche Job Boards?
[Healthcarejobsite](https://www.healthcarejobsite.com/), [efinancialcareers](https://www.efinancialcareers.com/), [allretailjobs](https://www.allretailjobs.com/) are some of the successful niche-focused websites for healthcare, finance, and retail industries respectively. There are thousands of specialized job boards tailored to specific industries and domains. But only a few are popular. We list some of the **strategies that have worked for successful Niche Job Boards** –
* Automated Job Aggregation using web scraping
* Extensive Database of job postings & candidates
* Focused Industrial Domain
* Focused Demography, Region
* Collaboration & *Relationship Building* with companies
* Multi-channel Monetization Models, Affiliate, Membership, Pay per click, Pay per post, Promotional posts
* Resume Building Templates For Candidates
* Efficient & Robust Matchmaking Algorithm
* Online Video Interviews & skill validation
* Review & Testimonials Feature For Funneling Right Candidates & Companies
* Salary Calculators & Career Growth Resources
# How To Build A Niche Job Board Website?
**Step 1:** Finding Your Niche For The Job Aggregator
**Step 2:** Decide On The Technological Stack Esp., **Job Scraping** Mechanism
**Step 3:** Build Your Job board In-House or Outsource
**Step 4:** Marketing & Launch
**Step 5:** Continuously **Aggregate Job Postings**, Collaborate With Companies, Engage Candidates, and Iteratively Improve Your Platform
# How To Find The Right Niche For Your Job Board?
This is one of the critical decision-making steps for starting a job board. Factor in the following to choose the best-suited niche for your job board:
* **Your Expertise & Exposure To The Industry**
It can be very beneficial for an entrepreneur to start in a domain where they have experience and expertise. You can cash in on your existing industry contacts and associations.
For example, ReactJs is a highly popular front-end web development technology with 170k+ stars on GitHub. More and more companies are hunting for employees adept in ReactJs. Starting a React-focused niche job board could be rewarding. Who knows?
* **Industry, Demand & Market Trends**
You have to take into consideration the future growth prospects of the industry in which you decide to start your niche job board.
For example, Covid forced companies to go remote temporarily. This boosted the trend to go permanently remote. Companies will be looking to hire remote workers. This rising trend translates into higher demand for niche remote job sites.
* **Competitive Landscape**
Some of the industrial niches in the recruitment field are already very saturated. So, you as an entrepreneur evaluating niches for job boards must contemplate the competition. Based on the analysis, you may bring in attractive services, features, and business models to beat the competition.
Niche job boards can be segmented based on the hierarchy, seniority level, specialization, demography, etc.
* **Domain Specialization**: Every day several new startups are launched. Founders might not be always “jack of all trades”. Even if they are, businesses need specialized skills & experience in marketing, sales, tech, legal, etc., to propel on the growth roadmap. So, job boards can be specifically built to serve CMOs, CTOs, Directors, Legal advisors & Promoters.
[LucasGroup.com](http://lucasgroup.com/), [datajobs.com](http://datajobs.com/), [jobsfordevops.com](http://jobsfordevops.com/) are some of the high niche-focused dedicated tech job boards.
* **Level of skill**: The right mix of experience & skill is always necessary to balance the growth, cultural and financial wheels of any enterprise. You may consider focusing on junior-level executives, intermediate-level workers, or senior-level, high-caliber, high-performing employees.
[AllExecutiveJobs](https://www.allexecutivejobs.com/) is a niche website for senior-level professionals in Europe.
* **Demography & Region**: What suits Asians may or may not suit Europeans, or Australians and Americans. The world changes every two KMs. We are different, and so are our expectations from work. Niche job boards can be personalized for a particular demographic, or for people in a particular city, country, or continent.
[JobsInMilan.com](http://jobsinmilan.com/) is a geographical niche job website catering to employees in Milan city of Italy.
# How To Choose The Right Tech Stack For Your Job Board?
* **Web Scraping Job Postings** From Other Job Portals, Career Sites, and enterprise ATSs
You can either use **automated data-extraction crawlers like Octoparse** or scrape manually to collect job postings from websites. Automated solutions can scrape millions of postings daily and scale well; scraping that volume manually, day after day, is simply not feasible.
* **Website Development —** Frontend, Backend, Database, Cloud Service
There are a plethora of options to choose from. You can consider frontend frameworks like Bootstrap, React, Angular, and Vue for coding the frontend of your website. For the backend, you may evaluate Node.js, Django (Python), Java, PHP, and others.
If you would rather go with a ready-made platform, you may consider WordPress, Drupal, or Joomla for developing the website. But these platforms can introduce bottlenecks and lock-in, which is not good for building a scalable website.
For Cloud services i.e., hosting your website application, you may go with any of AWS, GCP, or Azure. Cloud delivers agility and cost-effectiveness.
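Whichever stack you choose, the aggregation side ultimately comes down to pulling title and link fields out of listing pages. Below is a minimal sketch using only Python's standard library; the HTML structure (each posting as an `<a class="job-link">` element) is a made-up example, since real job sites differ and often need the crawler tools discussed above.

```python
from html.parser import HTMLParser

class JobLinkParser(HTMLParser):
    """Collect (title, href) pairs from anchors marked class="job-link"."""
    def __init__(self):
        super().__init__()
        self.jobs = []      # extracted (title, url) tuples
        self._href = None   # href of the job anchor currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "job-link":
            self._href = attrs.get("href")

    def handle_data(self, data):
        # Text inside an open job anchor is the posting title.
        if self._href is not None and data.strip():
            self.jobs.append((data.strip(), self._href))

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None

# Hypothetical fragment of a job-listing page.
sample = """
<ul>
  <li><a class="job-link" href="/jobs/react-dev-101">Senior React Developer</a></li>
  <li><a class="job-link" href="/jobs/devops-202">DevOps Engineer</a></li>
</ul>
"""

parser = JobLinkParser()
parser.feed(sample)
print(parser.jobs)
```

In practice the extracted records would be normalized (deduplicated, tagged by industry and region) before landing in the job board's database.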
[Keep reading: Use Octoparse for Web Scraping Job Postings At Scale](http://www.dataextraction.io/?p=1129/?red=)
How To Develop And Grow Your Niche Job Board Aggregator Websites? (r/Octoparse_ideas, 9/17/2021)
https://www.reddit.com/r/Octoparse_ideas/comments/ppscvt/how_to_develop_and_grow_your_niche_job_board/

Posted 12/8/2021:
https://youtu.be/dxKTTKlBTQo
How to scrape Facebook account with Octoparse (u/Octoparseideas, 12/8/2021)
https://www.reddit.com/r/u_Octoparseideas/comments/rbfi03/how_to_scrape_facebook_account_with_octoparse/

Commented 3/29/2022:
Hi, please submit a ticket here and the customer service team will help you step by step:
https://helpcenter.octoparse.com/hc/en-us/requests/new
Comment on "How to extract links with Octoparse" (r/Octoparse_ideas, 3/29/2022)
https://www.reddit.com/r/Octoparse_ideas/comments/tm1d39/how_to_extract_links_with_octoparse/i2icmzf/

Commented by u/shalashaska02, 3/24/2022:
You can get basic information about each app by calling the API directly:
[https://marketplace.zoom.us/api/v1/apps/filter?pageNum=1&pageSize=30](https://marketplace.zoom.us/api/v1/apps/filter?pageNum=1&pageSize=30)
Just change pageNum in the URL for the next page.
Edit: the "id" field is the app URL, i.e. the app's page lives at marketplace.zoom.us/apps/<id>.
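That approach can be sketched as below. The endpoint and the `pageNum`/`pageSize` parameters are taken from the comment above; the exact JSON shape of the response (a list of app records each carrying an "id") is an assumption, so inspect one page in a browser before relying on it.

```python
import json
import urllib.request

BASE = "https://marketplace.zoom.us/api/v1/apps/filter"

def page_url(page_num: int, page_size: int = 30) -> str:
    """Build the listing URL for one page of results."""
    return f"{BASE}?pageNum={page_num}&pageSize={page_size}"

def app_page_url(app_id: str) -> str:
    """Per the comment above, each record's "id" maps to its marketplace page."""
    return f"https://marketplace.zoom.us/apps/{app_id}"

def fetch_page(page_num: int) -> dict:
    """Fetch and decode one page of the apps listing (network call)."""
    with urllib.request.urlopen(page_url(page_num), timeout=10) as resp:
        return json.load(resp)

# URL construction only -- no network access needed to verify the pattern.
print(page_url(1))
print(app_page_url("VqdWYBqSRg-G6y4GTVMnCQ"))
```

To walk the whole catalog, call `fetch_page` with increasing `pageNum` until a page comes back empty.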
Comment on "Need help scraping URLS" (r/webscraping, 3/24/2022, edited)
https://www.reddit.com/r/webscraping/comments/tm1243/need_help_scraping_urls/i1wppd6/

Replied by u/paglaindian, 3/24/2022:
Thanks for this :)
Reply on "Need help scraping URLS" (r/webscraping, 3/24/2022)
https://www.reddit.com/r/webscraping/comments/tm1243/need_help_scraping_urls/i1x2zy9/

Posted 3/24/2022:
https://www.reddit.com/r/webscraping/comments/tm1243/need_help_scraping_urls/
How to extract links with Octoparse (r/Octoparse_ideas, 3/24/2022)
https://www.reddit.com/r/Octoparse_ideas/comments/tm1d39/how_to_extract_links_with_octoparse/

Posted 3/24/2022:
I'm trying to scrape this page, [https://marketplace.zoom.us/apps](https://marketplace.zoom.us/apps), to get the app name and URL for each app page. I'm using Octoparse for this. While I do get the app names, and pagination works well, I'm unable to extract the URL for each app.
I just can't find the URL in the page elements. Could someone please help?
An example: Kaltura has the app page URL [https://marketplace.zoom.us/apps/VqdWYBqSRg-G6y4GTVMnCQ](https://marketplace.zoom.us/apps/VqdWYBqSRg-G6y4GTVMnCQ), which clicking anywhere in the flexbox leads to.
Need help scraping URLS (r/webscraping, by u/paglaindian, 3/24/2022)
https://www.reddit.com/r/webscraping/comments/tm1243/need_help_scraping_urls/