pyspider pitfalls on macOS

Environment:
macOS 10.13.6, with the bundled Python 2.7

1. Installing pip

Install pip:

sudo easy_install pip

No pitfalls here, life is good.

2. The nightmare begins: installing pyspider

sudo pip3 install pyspider

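1. The installation itself went fine, nothing major. I only had to add two missing dependencies, six and nose if I remember right.
2. Running pyspider failed.
3. The error: ImportError: pycurl: libcurl link-time ssl backend (openssl)
This is a pycurl problem: the message means pycurl was built against a different SSL backend than the one libcurl is linked with. I uninstalled and reinstalled it over and over and it still failed. I tried tutorial after tutorial for ages, and each one failed in its own way. I was in tears.
4. Along the way I noticed a warning that Python 2.7 loses support in 2020, so I upgraded to 3.7, found that 3.7 doesn't get along with pyspider, and switched to Python 3.6. The problem was still there.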
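
Before reinstalling anything, it can help to see exactly what pycurl reports. The following is a minimal sketch, not part of the original troubleshooting: `diagnose_pycurl` and `ssl_backend` are hypothetical helper names, and the list of backend names in the regular expression is an assumption that may not cover every libcurl build. It only relies on `import pycurl` and the `pycurl.version` string.

```python
import re

# Names of SSL libraries that may show up in a pycurl.version string.
# This list is an assumption for illustration, not an exhaustive one.
_SSL_NAMES = r"\b(OpenSSL|LibreSSL|BoringSSL|SecureTransport|GnuTLS|NSS|mbedTLS|wolfSSL)\b"


def ssl_backend(version_string):
    """Return the first SSL library named in a pycurl.version-style string, or None."""
    match = re.search(_SSL_NAMES, version_string)
    return match.group(1) if match else None


def diagnose_pycurl():
    """Try to import pycurl and return (ok, detail).

    On success, detail is pycurl.version; on failure, it is the ImportError text,
    which names the mismatching SSL backends.
    """
    try:
        import pycurl
    except ImportError as exc:
        return False, str(exc)
    return True, pycurl.version


if __name__ == "__main__":
    ok, detail = diagnose_pycurl()
    print("pycurl imports fine:" if ok else "pycurl is broken:", detail)
    if ok:
        print("SSL backend reported by libcurl:", ssl_backend(detail))
```

If the import fails, the printed message shows which backend libcurl was linked with, which is the value pycurl has to be rebuilt against.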

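3. The light of victory

1. Install pycurl through setup.py: download the source from GitHub, then install it directly with setup.py. It fails! Don't panic, it just reports this error:
clang error: 'src/docstrings.c' no such file
2. Running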

python setup.py docstrings

fixes that right away. Repeating the steps above fails again, this time with:
src/pycurl.h:164:13: fatal error: 'openssl/ssl.h' file not found
3. macOS doesn't seem to come with OpenSSL, so just install it with

brew install openssl

Once it finishes, the end of the log shows the path where OpenSSL was installed. Now watch the magic happen:

python setup.py install --with-openssl --openssl-dir=/usr/local/Cellar/openssl/1.0.2s

Hahaha, success! Here is a picture to commemorate it.
[Image: pyspider]
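
The two ingredients of the fix (finding where the OpenSSL headers live, then pointing the pycurl build at them) can also be sketched in Python. This is only an illustration under assumptions: `find_openssl_prefix` and `pycurl_build_env` are hypothetical helpers, `/usr/local/opt/openssl` is assumed to be Homebrew's symlink on an Intel Mac, and the environment-variable route (`PYCURL_SSL_LIBRARY` plus `CPPFLAGS`/`LDFLAGS`) is an alternative to the setup.py flags used above, not what I actually ran.

```python
import os


def find_openssl_prefix(candidates, is_file=os.path.isfile):
    """Return the first prefix that contains include/openssl/ssl.h, or None."""
    for prefix in candidates:
        header = os.path.join(prefix, "include", "openssl", "ssl.h")
        if is_file(header):
            return prefix
    return None


def pycurl_build_env(prefix, base_env=None):
    """Return a copy of the environment with the variables a pycurl build reads."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYCURL_SSL_LIBRARY"] = "openssl"  # tell pycurl's setup.py which backend to use
    env["CPPFLAGS"] = "-I" + os.path.join(prefix, "include")  # where openssl/ssl.h lives
    env["LDFLAGS"] = "-L" + os.path.join(prefix, "lib")  # where libssl lives
    return env


if __name__ == "__main__":
    # Candidate locations: the first is an assumed Homebrew symlink,
    # the second is the path printed by brew in this post.
    candidates = ["/usr/local/opt/openssl", "/usr/local/Cellar/openssl/1.0.2s"]
    prefix = find_openssl_prefix(candidates)
    if prefix is None:
        print("openssl/ssl.h not found; run brew install openssl first")
    else:
        env = pycurl_build_env(prefix)
        for key in ("PYCURL_SSL_LIBRARY", "CPPFLAGS", "LDFLAGS"):
            print(key + "=" + env[key])
        # The environment could then be handed to a rebuild, for example:
        # subprocess.check_call([sys.executable, "-m", "pip", "install",
        #                        "--no-cache-dir", "--force-reinstall", "pycurl"], env=env)
```

The rebuild call is left commented out on purpose, so running the script only prints what it would use.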

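4. A crawler for V2EX posts and comments (don't crawl too fast or your IP gets banned instantly; the ban lifts by itself the next day. Don't ask me how I know.)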

#!/usr/bin/env python
# -*- encoding: utf-8 -*-
# Created on 2016-08-17 11:11:46
# Project: v2ex

from pyspider.libs.base_handler import *
import re
import random
import pymysql
import time


class Handler(BaseHandler):
    crawl_config = {
    }

    def __init__(self):
        # Keyword arguments: PyMySQL 1.0+ no longer accepts positional ones
        self.db = pymysql.connect(host='localhost', user='root', password='8', database='QA', charset='utf8')

    def add_question(self, title, content, comment_count):
        """Insert one question row and return its id (None if the insert fails)."""
        try:
            cursor = self.db.cursor()
            created = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
            # Let the driver escape the values: scraped text is full of quotes
            sql = ("insert into question(title, content, user_id, created_date, comment_count) "
                   "values (%s, %s, %s, %s, %s)")
            # user_id is picked at random between 50 and 59
            cursor.execute(sql, (title, content, random.randint(50, 59), created, comment_count))
            qid = cursor.lastrowid
            self.db.commit()
            return qid
        except Exception as e:
            print(e)
            self.db.rollback()

    def add_comment(self, qid, comment):
        """Insert one reply row that belongs to question qid."""
        try:
            cursor = self.db.cursor()
            created = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
            sql = ("insert into comment(user_id, content, created_date, entity_id, entity_type, status) "
                   "values (%s, %s, %s, %s, %s, %s)")
            cursor.execute(sql, (random.randint(50, 59), comment, created, qid, 1, 0))
            self.db.commit()
        except Exception as e:
            print(e)
            self.db.rollback()
            
    @every(minutes=24 * 60)
    def on_start(self):
        self.crawl('https://v2ex.com', callback=self.index_page)

    @config(age=10 * 24 * 60 * 60)
    def index_page(self, response):
        for each in response.doc('a[href^="https://v2ex.com/?tab=tech"]').items():
            self.crawl(each.attr.href, callback=self.board_page)

 

    @config(priority=2)
    def board_page(self, response):
        for each in response.doc('a[href^="https://v2ex.com/t/"]').items():
            url = each.attr.href
            if url.find('#reply') > 0:
                url = url[0:url.find('#')]
            self.crawl(url, callback=self.detail_page)
        for each in response.doc('a.page_normal').items():
            self.crawl(each.attr.href, callback=self.board_page)

    @config(priority=20)
    def detail_page(self, response):
        title = response.doc('h1').text()
        content = response.doc('div.topic_content').html()
        # Collect the replies once so they can be both counted and stored
        comments = [each.text() for each in response.doc('div.reply_content').items()]
        qid = self.add_question(title, content, len(comments))
        if qid is not None:
            for comment in comments:
                self.add_comment(qid, comment)

        # Return the full reply list; a post without replies yields an empty list
        return {
            'url': response.url,
            'title': title,
            'content': content,
            'comments': comments
        }
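
One thing worth illustrating about the database part: post titles and replies are full of quotes, and SQL assembled with `%` string formatting breaks on them. Here is a small sketch of the difference. It uses the standard-library `sqlite3` as a stand-in for MySQL so that it runs anywhere; the table layout and function names are made up for the demo, and note that `sqlite3` uses `?` placeholders where pymysql uses `%s`.

```python
import sqlite3


def add_question_unsafe(db, title, content):
    """Build the SQL by string formatting; breaks as soon as the text contains a quote."""
    sql = "INSERT INTO question (title, content) VALUES ('%s', '%s')" % (title, content)
    cursor = db.execute(sql)
    db.commit()
    return cursor.lastrowid


def add_question(db, title, content):
    """Pass the values separately so the driver does the escaping."""
    cursor = db.execute(
        "INSERT INTO question (title, content) VALUES (?, ?)", (title, content)
    )
    db.commit()
    return cursor.lastrowid


# In-memory database with a simplified, made-up version of the question table
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE question (id INTEGER PRIMARY KEY, title TEXT, content TEXT)")

title = "What's new in pyspider?"
content = "<p>It's a test</p>"

try:
    add_question_unsafe(db, title, content)
    unsafe_failed = False
except sqlite3.OperationalError:
    # The apostrophe in the title ends the SQL string literal early
    unsafe_failed = True

qid = add_question(db, title, content)
stored_title = db.execute("SELECT title FROM question WHERE id = ?", (qid,)).fetchone()[0]
print(unsafe_failed, stored_title == title)
```

The same idea carries over to pymysql by passing the values as the second argument of `cursor.execute`.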
    