SCPD Class Link Extractor

May 9th, 2009 25 comments

Updated 1/13/11: It turns out I had corrected the script sometime in 2010 but thought I had already uploaded the fix, which is why my script looked so similar to Joey’s. I just checked, and the script posted here is in fact not the same as the old one. I apologize for the confusion.

At the end of every quarter I use the following Greasemonkey script to extract the video links for each of my classes.

Usage: Go to the current quarter SCPD page and click on each of your course links. Extracting the links takes a while, since the script is based on the previous SCPD link extractor: it simply visits each lecture page and pulls out the link. After extraction finishes, a new tab will open displaying all the links for you to copy and paste. (This uses the same idea as the Xanga2RSS extractor.)

It seems that the SCPD program keeps the lectures archived at these links for a pretty long time, at least a quarter or two. (I once Googled for SCPD links and found an EE one from ’07 that still worked, so perhaps SCPD never really erases them.)

Here it is below:

// ==UserScript==
// @name           SCPD links
// @namespace      hawflakes
// @description    Strips video links off of SCPD
// @include        https://myvideosu.stanford.edu/*
// ==/UserScript==
 
var vidurls="";
var links;
 
// Main link page for a course
if (window.location.toString().indexOf("GradCourseInfo.aspx?")>=0)
{
// Get links to video page
 
links = findXPathNodes("/html/body//table/tbody/tr/td/a[text()='WMP']");
 
if (links.snapshotLength > 0)
{
	// Each WMP href wraps the lecture-page URL in a javascript: call;
	// strip everything outside the quotes to get the bare URL.
	var href = links.snapshotItem(0).getAttribute("href");
	var url = href.substring(href.indexOf("'")+1);
	url = url.substring(0, url.length-4);
	// Fetch the first lecture page; getlink() chains through the rest.
	get(url, "", getlink, 0);
}
 
}
 
// helper functions
function findXPathNode(xpath, start,doc)
{
	var result = (doc == null ? document : doc).evaluate(xpath,(start == null ? document : start), null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE ,null);
	return (result.snapshotLength > 0 ? result.snapshotItem(0) : null);
}
 
function findXPathNodes(xpath, start,doc)
{
	return (doc == null ? document : doc).evaluate(xpath,(start == null ? document : start), null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE ,null);	
}
 
function elem(tagname,content)
{
	var ret = document.createElement(tagname);
	ret.innerHTML = content;
	return ret;
}
 
// Asynchronous GET; calls cb(client, info) once the response arrives.
function get(url, data, cb, info) {
	var client = new XMLHttpRequest();
	client.open("GET", url, true);
 
	client.onreadystatechange = function () {
		if (client.readyState == 4) {
			cb(client, info);
		}
	};
	client.send(null);
}
 
function getlink(client,info)
{
	// Pull the .wmv URL out of the data="..." attribute on the lecture page.
	var tempurl = client.responseText.substring(client.responseText.indexOf("data=\"")+"data=\"".length, client.responseText.indexOf(".wmv\"")+4)+"\n";
	alert(tempurl);
 
	vidurls = vidurls + tempurl;
	GM_log(tempurl);
 
	if (info < links.snapshotLength-1)
	{
		// More WMP links to go: fetch the next lecture page.
		var href = links.snapshotItem(info+1).getAttribute("href");
		var url = href.substring(href.indexOf("'")+1);
		url = url.substring(0, url.length-4);
		get(url, "", getlink, info+1);
	}
	else
	{
		// Done: open a new tab listing every link for copy/paste.
		GM_openInTab("data:text/plain;charset=UTF-8," + encodeURI(vidurls));
	}
}
Categories: Uncategorized

Downtime…

April 25th, 2009 No comments

Just got this blog back up and running. My host migrated to a better server and I had to recompile PHP and the works, since this runs on FastCGI. Anyhow, apparently I had compiled PHP with IPv6 enabled, and WordPress kept farting out the error message “Error establishing database connection”, even though I had logged into the SQL server via the command line multiple times and verified that everything was there.

After an hour of searching, I found the answer on the WordPress forums. The gist of it is that the DB_HOST value in wp-config.php had to be changed to 127.0.0.1. After this correction, the blog magically worked again.
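Concretely, the fix is a one-line change in wp-config.php. (The original value is presumably the default “localhost”; my guess is that with IPv6 enabled, “localhost” resolves to ::1, where MySQL isn’t listening, while 127.0.0.1 forces an IPv4 connection.)

/* wp-config.php: point WordPress at the IPv4 loopback address explicitly */
define('DB_HOST', '127.0.0.1'); // was presumably the default 'localhost'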

Note, I am not leaving you dry. I’ve been working on a simple Deluge plugin for personal use which I may release, as well as a simple MathML to LaTeX converter using my MathML-in-GWT kit (GWT converter). Currently it can spit out a decent amount of LaTeX, but I still need to automate the compatibility of the symbols and such.

UCLA Bruincast Greasemonkey link ripper

March 25th, 2009 3 comments

Just a quick post on a simple Greasemonkey script that rips UCLA Bruincast video links (obviously you need access). I made it for my brother, so he can download lectures for classes he would like to save. Here’s the short greasy script. (Remember to rename the URLs from “http:” to “rtsp:”; or see the one-line tweak after the script to have that done for you.) To grab the DSL links instead, simply replace ‘LAN’ with the text of that link. Enjoy!

// ==UserScript==
// @name           BruinCastPageLinks
// @namespace      hawflakes.unoc.net
// @description    Grabs Links on a page
// @include        http://www.oid.ucla.edu/webcasts/courses/*
// ==/UserScript==
 
// Greasemonkey Script written by Jonathan Wong, Copyright 2009
// Script is freeware (Use at your own risk!).
 
// Collect the href of every 'LAN' link on the page.
Links = findXPathNodes("//a[text()='LAN']/@href");
PrintString = "";
 
for (var i = 0; i < Links.snapshotLength; i++)
{
	PrintString = PrintString + Links.snapshotItem(i).value + "\n";
}
 
// Open a new tab listing the links for copy/paste.
GM_openInTab("data:text/plain;charset=UTF-8," + encodeURI(PrintString).replace(/&nbsp;/g,"&amp;nbsp;"));
 
function findXPathNode(xpath, start,doc)
{
	var result = (doc == null ? document : doc).evaluate(xpath,(start == null ? document : start), null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE ,null);
	return (result.snapshotLength > 0 ? result.snapshotItem(0) : null);
}
 
function findXPathNodes(xpath, start,doc)
{
	return (doc == null ? document : doc).evaluate(xpath,(start == null ? document : start), null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE ,null);	
}
 
function elem(tagname,content)
{
	var ret = document.createElement(tagname);
	ret.innerHTML = content;
	return ret;
}
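If you would rather have the script do the http-to-rtsp renaming for you, a one-line tweak to the loop above should do it (a quick sketch; I have not tested this variant):

// Rewrite the scheme so the links come out as rtsp:// directly.
PrintString = PrintString + Links.snapshotItem(i).value.replace(/^http:/, "rtsp:") + "\n";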
Categories: Education, Useful Apps

UCLA streams – Real Media (.rm)

March 25th, 2009 No comments

So apparently mencoder only seems to work for asf/wmv/avi/mpeg streams and does not like Real Media streams (I think this is purely because I do not have Real Media codecs on this machine, so the -ovc/-oac options would not work anyway). So I have defaulted back to using mplayer for ripping the Real Media streams. Fortunately, I remembered seeing this, which pretty much explains what one needs to do.
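For reference, the mencoder invocation for those other stream types looks something like the line below (the URL is a made-up placeholder; -ovc copy and -oac copy just copy the video and audio streams into the output file without re-encoding):

mencoder mms://url.wmv -ovc copy -oac copy -o out.avi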

For UCLA, the linked .rm file is actually just plain text containing the actual stream URL. As in the Stanford case, one can simply replace “http” with “rtsp”.


mplayer -noframedrop -dumpfile out.rm -dumpstream rtsp://url.rm

The most interesting piece of information is that on Windows, mplayer can rip streams even if it can’t actually play them. For example, my Windows machine (what I’m using atm) has neither Real Player nor alternative/hacked RM codecs, so I can’t actually play .rm files. However, I was able to rip the stream, and someone else was able to play it. This makes sense, since -dumpstream just copies the raw bytes to disk without ever decoding them.