Scrapy – Call a function when the spider closes

Hello Wednesday,

Today at work I had the chance to play with Scrapy. It is quite fast and really easy to use.
I will write more posts about this crawling framework later, but for now here is a quick note on how to call a function when a spider closes.

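Once all the items have been crawled, you may want to do something else. Simply define a closed method on your spider: Scrapy calls it when the spider closes and passes in the reason (for example 'finished').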

# -*- coding: utf-8 -*-
import scrapy


class TcvStockSpider(scrapy.Spider):
    def closed(self, reason):
        # Called once when the spider closes; do something here
        pass

The Scrapy documentation describes another way to achieve this (connecting a handler to the spider_closed signal), but I think the closed method is the simplest one.

Retry failed Kue jobs in Node.js

Hello there, today I’m going to write about how to clear failed Kue jobs and how to retry them.

Kue is a priority job queue backed by Redis, built for Node.js. Sometimes a job fails; it is then marked as failed and stays that way until you intervene. In this case you will want to remove it or re-attempt it.

First of all you need to find all those failed jobs:

kue.Job.rangeByState( 'failed', 0, n, 'asc', function( err, jobs ) {
    // you have an array of maximum n Job objects here
});

Next, if you want to remove those jobs from the queue, simply use this code:

async.eachSeries(jobs, function(job, cb) {
    // delete the failed job, then move on to the next one
    job.remove(function(err) { cb(err); });
}, cb);

In case you want to retry them, change their state from failed to inactive:

async.eachSeries(jobs, function(job, cb) {
    // put the job back in the queue: failed -> inactive
    job.inactive(function(err) { cb(err); });
}, cb);

To sum up, the whole function to retry failed jobs is:

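var retryFailedKue = function(cb) {
    var n = 2000; // maximum number of failed jobs to fetch
    var kue = require('kue');
    var async = require('async');
    async.waterfall([
        function (cb) {
            // step 1: fetch up to n failed jobs, oldest first
            kue.Job.rangeByState('failed', 0, n, 'asc', cb);
        },
        function (jobs, cb) {
            // step 2: move each failed job back to inactive
            async.eachSeries(jobs, function(job, cb) {
                job.inactive(function(err) { cb(err); });
            }, cb);
        }
    ], function (err) {
        if (err) console.log(err);
        cb(err);
    });
};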
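
Removing and retrying only differ in what is done to each job, so the two loops can share one helper. The following is a minimal sketch, not part of Kue itself: forEachFailedJob, removeJob and retryJob are hypothetical names, and the Job argument is assumed to behave like kue.Job (rangeByState, plus remove and inactive on each job) as described above. It uses plain callbacks instead of the async library.

```javascript
// Sketch: walk the failed jobs one at a time and apply an action to each.
// `Job` is assumed to look like kue.Job; pass require('kue').Job in real use.
function forEachFailedJob(Job, limit, action, done) {
    Job.rangeByState('failed', 0, limit, 'asc', function (err, jobs) {
        if (err) return done(err);
        var index = 0;
        // Process jobs in series; stop at the first error.
        function next(err) {
            if (err) return done(err);
            if (index >= jobs.length) return done(null, jobs.length);
            action(jobs[index++], next);
        }
        next();
    });
}

// The two actions discussed in this post (hypothetical helper names).
function removeJob(job, cb) { job.remove(cb); }
function retryJob(job, cb) { job.inactive(cb); }

// Example call (needs Kue and a running Redis, so it is left commented out):
// forEachFailedJob(require('kue').Job, 2000, retryJob, function (err, count) {
//     if (err) console.log(err);
// });
```

Passing the Job class in as an argument keeps the helper easy to try out with a fake object before pointing it at a real queue.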

Git – Untrack pyc files from source control

Why do we need to do this? What is a pyc file?
Python automatically compiles a script's imported modules to bytecode before executing them and caches the result in .pyc files, so later runs start faster. Because these files are generated automatically, there is no point in committing them to your project’s source control. To stop tracking the ones already committed (they stay on disk):

$ find . -name '*.pyc' | xargs -n 1 git rm --cached

Beautify URL – Filter values before submitting Form

Case: Say you are using pjax and your site’s search page has a form with 3 inputs (A, B and C). When the user changes only the value of filter C and submits the form, your site’s URL will turn into something like:

Problem: Ugly URL

Solution: Filter values before submitting form

Expected result:


$formSearch.submit(function(event) {
    var $form = $(this);
    var options = {};
    // send only the inputs that actually have a value
    options.data = $form.find(":input").filter(function() {
        return $(this).val() !== '';
    }).serializeArray();
    $.pjax.submit(event, '#pjax-container', options);
});
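
To see the effect of the filter without a browser, here is a small standalone sketch of the same idea. buildCleanQuery is a hypothetical helper, the field names A, B and C come from the case above, and the value 'shoes' is made up for illustration. The input has the same shape as the output of jQuery's serializeArray().

```javascript
// Keep only the fields whose value is not an empty string,
// then build a query string from what is left.
function buildCleanQuery(fields) {
    var params = new URLSearchParams();
    fields.forEach(function (field) {
        if (field.value !== '') params.append(field.name, field.value);
    });
    return params.toString();
}

// Only filter C was changed by the user; A and B are still empty.
var fields = [
    { name: 'A', value: '' },
    { name: 'B', value: '' },
    { name: 'C', value: 'shoes' }
];
console.log(buildCleanQuery(fields)); // C=shoes
```

Without the filter, the empty A and B parameters would also end up in the query string, which is exactly the ugly URL described in the problem.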