dom - Capture PHP links without image links


$url = 'http://www.test.com/';
$dom = new DOMDocument;
@$dom->loadHTMLFile($url);

$links = $dom->getElementsByTagName('a');
foreach ($links as $link) {
    // ...
}

I am using the script above to capture the links on a page, but I found that there are duplicate links. On the page, there is a picture that is linked, followed by a text link that goes to the same URL. Is there an easy way to capture only the text link, not the image link?

As I was saying, you might take the approach of cleaning up the dupes in the result set. I'm not sure what you would want when scraping if a link is only used on an image.

You could count the occurrences of each link.

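$url = 'http://www.test.com/';
$dom = new DOMDocument;
@$dom->loadHTMLFile($url);

$links = $dom->getElementsByTagName('a');
$distinctLinks = [];
foreach ($links as $link) {
    // Key on the href string; a DOMElement cannot be used as an array key.
    $href = $link->getAttribute('href');
    // Start from 0 the first time an href is seen, then count each occurrence.
    $distinctLinks[$href] = ($distinctLinks[$href] ?? 0) + 1;
}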
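If the goal is to skip image links outright rather than count duplicates, one possible approach is to select only the anchors that contain no img element. The following is a minimal sketch, not a definitive solution: the helper name textLinkHrefs and the sample markup are made up for illustration, and it parses an inline HTML string with loadHTML so it can run without a network request.

```php
<?php
// Hypothetical helper: return the distinct hrefs of anchors that contain no <img>.
function textLinkHrefs(string $html): array
{
    $dom = new DOMDocument;
    // Suppress warnings from imperfect real-world markup, as in the snippets above.
    @$dom->loadHTML($html);

    $xpath = new DOMXPath($dom);
    $hrefs = [];
    // Select anchors that have an href attribute and no <img> descendant.
    foreach ($xpath->query('//a[@href][not(.//img)]') as $link) {
        // Using the href as an array key de-duplicates repeated links.
        $hrefs[$link->getAttribute('href')] = true;
    }
    return array_keys($hrefs);
}

// Sample markup (made up for illustration): an image link followed by a
// text link to the same URL, plus a link that is used only on an image.
$html = '<html><body>'
      . '<a href="/page1"><img src="pic.jpg" alt=""></a>'
      . '<a href="/page1">Page one</a>'
      . '<a href="/page2">Page two</a>'
      . '<a href="/page3"><img src="only.jpg" alt=""></a>'
      . '</body></html>';

print_r(textLinkHrefs($html));
```

Note that a link used only on an image (/page3 in the sample) is dropped entirely with this filter. If such links should be kept, counting occurrences per href and de-duplicating afterwards may be the safer route.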
