Scraping the Yahoo SERP for “Ultrabook Notebook Tipis Harga Murah Terbaik”


This script scrapes the Yahoo SERP for the query “Ultrabook Notebook Tipis Harga Murah Terbaik” (roughly, “best thin, low-priced ultrabook notebooks”).

Let's go straight to trying the script below.

Scraper code example:

<?php

// Fetch $url through $proxy with cURL.
// Returns an array: 'EXE' = response, 'INF' = transfer info, 'ERR' = error text.
function getPage($proxy, $url, $referer, $agent, $header, $timeout) {

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, $header); // 1 = include response headers in the output
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
    curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);

    //curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    //curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);
    //curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);

    $result = array();
    $result['EXE'] = curl_exec($ch);
    $result['INF'] = curl_getinfo($ch);
    $result['ERR'] = curl_error($ch);

    curl_close($ch);

    return $result;
}

/* SCRAPING YAHOO */
$result = getPage(
    '[proxy IP]:[port]', // get a proxy from somewhere
    'http://search.yahoo.com/search?p=Ultrabook+Notebook+Tipis+Harga+Murah+Terbaik',
    'http://www.yahoo.com/',
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.8',
    1,  // include response headers in the output
    5); // connection timeout in seconds

if (empty($result['ERR'])) {
    // capture the href (group 1) and anchor text (group 2) of each result heading
    preg_match_all('(<h3><a class.*href="(.*)".*>(.*)</a>)siU',
        $result['EXE'], $matches);

    for ($i = 0; $i < count($matches[1]); $i++) {
        // decode url
        $matches[1][$i] = urldecode($matches[1][$i]);
        // get rid of rds.yahoo.com redirect; leave the URL as-is if it has no "**" marker
        if (preg_match('/\*\*(https?:\/\/.*$)/siU', $matches[1][$i], $urls)) {
            $matches[1][$i] = $urls[1];
        }
        //echo "<a href='".$matches[1][$i]."'>".strip_tags($matches[2][$i])."</a><br>";
    }

    // strip tags

    //for ($i = 0; $i < count($matches[2]); $i++) {
    //    $matches[2][$i] = strip_tags($matches[2][$i]);
    //    echo "<a href='".$matches[1][$i]."'>".$matches[2][$i]."</a><br>";
    //}

    // Job's done!
    // $matches[1] contains URLs
    // $matches[2] contains anchors
    // …
} else {
    //echo $result['ERR'];
}

/* END SCRAPING YAHOO */

//echo $result['ERR'];
//print_r($result['INF']); // 'INF' is an array, so echo would only print "Array"
echo $result['EXE'];

?>
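
To see what the two regular expressions in the script do without needing a proxy, here is a minimal sketch that runs them on a hard-coded snippet. The sample markup, the example.com and example.org URLs, and the helper name unwrapRedirect are all made up for illustration; the markup Yahoo actually returns may look different, so treat this as a way to test the parsing step and not as a guaranteed match for the live page.

```php
<?php

// Hypothetical helper (not part of the original script): decodes a link and
// strips a ".../**<real URL>" redirect wrapper if one is present.
function unwrapRedirect($href) {
    $decoded = urldecode($href);
    if (preg_match('/\*\*(https?:\/\/.*)$/s', $decoded, $m)) {
        return $m[1];
    }
    return $decoded; // no "**" marker, keep the link unchanged
}

// Made-up markup that imitates the <h3><a class=... href=...> shape the script expects.
$sampleHtml =
    '<h3><a class="yschttl" href="http://rds.yahoo.com/abc/**http%3a//example.com/ultrabook">Ultrabook <b>Murah</b></a></h3>'
  . '<h3><a class="yschttl" href="http://example.org/notebook">Notebook Tipis</a></h3>';

// Same pattern as in the script, written with "~" as the delimiter.
preg_match_all('~<h3><a class.*href="(.*)".*>(.*)</a>~siU', $sampleHtml, $matches);

$links = array();
foreach ($matches[1] as $i => $href) {
    $links[] = array(
        'url'    => unwrapRedirect($href),
        'anchor' => strip_tags($matches[2][$i]),
    );
}

print_r($links);
```

Each entry in $links holds a cleaned URL and its anchor text, so the result can be looped over or saved without touching the raw $matches array again.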

Source code from: http://www.fromzerotoseo.com/scraping-yahoo-serp/

Good luck trying it out.

Regards,

surya wijaya
newbie programmer
