SCPD stream “re-encoding” mencoder to save space

October 15th, 2011 7 comments

The SCPD downloading script posted several years back seems to use some interesting mencoder encoder and so there always appears to be extra information packaged in the video file. Unfortunately, I needed to reduce the size of the streams and it appears that by using ffmpeg and extract the audio and video, and then muxing it back together seems to reduce the overhead file size by about 100 Mb.

I’ve included a simple bash script which you can modify below which takes a list of past streams to re-process (for windows you will need do figure out the dos equivalent). There doesn’t seem to be any loss in video quality and sound quality seems fine, so it seems like a free win.

Note: I have tried using x264 with high10 encoding, but it seems I cannot get the x264 10-bit compilation to produce proper ts video. The windows x264 10-bit from x264.nl seems to do a perfectly fine job, and ffmpeg in linux can then mux it with the audio, but using the linux compiled 10-bit, there seems to be problems with muxing it in ffmpeg, although the actual video only encode works as it should. If both can be used in conjuction, perhaps one could even attain a smaller file size by using 10-bit compression and the reduction in overhead explained here.

while [ -n "$1" ]
do
    echo "$1"
    ffmpeg -threads 0 -i $1 -vn -acodec copy $1.wma
    ffmpeg -threads 0 -i $1 -an -vcodec copy $1.wmv
    ffmpeg -threads 0 -async 1 -i $1.wma -i $1.wmv -vcodec copy $1.mkv
    rm -rf $1.wma $1.wmv
    shift
done

For those of you that interested in 10-bit encoding with the profile high10 in x264 here are some snippets:

# x264 --threads 0 --profile high10 -o $1.264 $1                                
# ffmpeg -threads 0 -i $1 -vn -acodec copy $1.wma                               
# ffmpeg -async 1 -threads 0 -i $1.264 -i $1.wma  -vcodec copy $1.final.mp4
Categories: Uncategorized Tags: , , ,

Ein (Einstein Index Notation) – A Domain Specific Language (DSL) Interpreter for Index Notation

January 11th, 2011 No comments

Today I have unveiled a small project I’ve been working on and off (mainly off) for the past 9 months. If you don’t care about the steps, but are interested in the implementation and current results of the language skip down to the Language Information section.

Ein Developement History:

The idea for creating a DSL mainly formulated back in April of 2010, when I participated in a small Scala DSL-based hackathon at Stanford. The language was called Liszt, after the music composer. It is currently being developed (hopefully) by some Ph.d. students and aims to create a DSL for the purpose of reusing as much code as possible when migrating from single-core to multi-core/multi-GPU-core/cluster architectures. The idea was to have the compiler figure out all the nifty multi-core/MPI/CUDA details and the coder could simply concentrate on writing some sort of “higher-level” code. As a person who deals with numerical computation in the general area of continuum mechanics , biomechanics, and finite elements, I jumped on board as an early test subject to see if I could leverage any of it to my advantage.

During the Liszt hackathon, some rules were enforced, such that we were not allowed to use iterators. (I assume the rules have something to do with the implementation of Liszt a year ago). This made multiplying matrices, and so forth much more difficult. In fact, writing a small finite element routine is quite tedious with iterators or iterable constructs (such as for-loops, and etc). Because of this restriction, I realized that even with Matlab, high-order tensors necessitated using for loops and there were no simple constructs for  handling operations such as double contractions in Matlab cleanly. (Yes, I do realize you could do something like sum(matrix1.*matrix2), but that has more to do with utilizing underlying knowledge of Matlab data structures than with Matlab being designed for index notation. )

At any rate, Einstein Index Notation, is something used quite commonly in the area of mechanics and is named after its “inventor”. It’s based on several important conceptual notations, but the practical reasoning behind the notation is in implied summations over repeated indices (aka. dummy indices). With the exception of the assembly operator and some other operators related to boundary conditions, finite elements are formulated and then coded directly using index notation.

After listening to several seminar lectures before April 2010 about Liszt, I became interested in Scala and used it to generate some object oriented meshes for a group of randomized finite element simulations for research. After my frustration with the initial version of Liszt, I explored the interwebz and found some tutorials/blogs about implementing internal and external DSLs in Scala. I started writing an indicial notation internal DSL, but realized that I actually needed to write a Abstract Syntax Tree (AST) for my language because of sub expression in parenthesis and such. I started writing my own, but realized that I didn’t really have a solid understand of Scala case classes, and Scala Parser combinators were quite daunting, as I had only used Scala in a functional programming paradigm for the aforementioned mesh generator. Luckily I found this project called Kiama, which “is a Scala library for language processing”. Luckily, I found an example called Imperative, which shows how to parse a simple mathematical language described by a custom AST. This gave me enough time and incentive to play around with Scala case classes and parser combinators.

After setting up and designing the AST for my language back in June, due to research deadlines, I went on a development haitus until Xmas, when I decided to take a break from research. Implementing index notation rules was more difficult that I imagined, since humans make lots of checks and can naturally translate notation to code.

The following is a list of things that had to be checked, or were things I never thought of as a mere human:

  1. Checking for repeated indices only show up twice.
  2. Checking Indicial ranges for possible IndexOutOfBoundException.
  3. Checking for mismatched or indices that would not result in the proper expressions
  4. Generating code from an AST
  5. Generating code optimizations due to symmetry of tensors (Voigt notation)

These things actually posed sufficient problems and resulted in a first attempt of 827 lines of Scala code. The code is currently somewhat messy, and a little hackish in terms of elegantly generating code and only generates C type languages (although it is designed to handle fortran, Matlab, and Python conventions). I will the language under the Simplified BSD license, although kiama is of course licensed under LGPL/GPL.

Language Information:

As noted before, Einstein Index Notation, is named after Albert Einstein. Since the first 3 letters of Einstein Index Notation are also the first three letteres of Einstein, the punny engineer in me has decided to name this tiny simple external Scala DSL as Ein.  Although the language does not account for contravariant and covariant indices and should therefore be named Index Notation (In), as “in”, is an English word, Ein seems like a better choice. (I apologize to Germans, since Ein is a German word, but since I’m the creator too bad.)

There are only a handful of statements that Ein currently supports:

  1. Set [T] = {dim1,dim2,…dimn}; : This defines the dimensions of tensor [T]. The length of the dimensions is its order.
  2. [A] = expr; : This assigns a tensor [T] to the given expression.
  3. env; : This displays the current information known by Ein about defined tensors.

Tensors are represented as follows:

identifier {indices} : identifier is the “name” of the Tensor. indicies are optional unless the expression necessitates indices (aka during Set statements they are not needed, nor for scalar tensors). Currently there are only two representations of indices and both only accept one character alphanumeric characters. Numbers should be between 1 and the specified or assumed dimension, if using repeated indices.

  1. Normal indices are just single characters and go between curly braces, i.e. T{i,i} for the trace of T.
  2. Voigt indices are specified within brackets, i.e. T{[i,i]} for the trace of T, where T is assumed by Ein to be symmetric. Ein does not check for symmetry.

Armed with this knowledge, one can start tapping out hopefully many Ein expressions. Below is a bit of code and some explanation about some generated test cases I have used.

Enter imperative language programs for parsing.
Ein> Set A = {3,3};
Size of A is List(3, 3)
Ein> p = A{i,i};
for (i=0;i<3+0;i++){ p +=A[i][i]; } Ein> Set x = {3};
Size of x is List(3)
Ein> b{i} = A{i,j}*x{j};
for (i=0;i<3+0;i++){
for (j=0;j<3+0;j++){
b[i] +=A[i][j]*x[j];
}
}
 
Ein> a{i} = b{i} + x{i};
for (i=0;i<3+0;i++){
a[i]+=b[i];
a[i]+=x[i];
}
 
Ein> d{i} = p*a{i} + A{i,k}*b{k};
for (i=0;i<3+0;i++){
d[i]+=p*a[i];
for (k=0;k<3+0;k++){
d[i] +=A[i][k]*b[k];
}
}

vg, vi, and vj are predefined matrices and arrays, that I have not explicitly printed. vg is a matrix where each element maps to it’s numbering in Voigt notation. While there are several different arbitrary ways of numbering elements using Voigt notation, I will use the following for a 3×3 matrix, T:

  • the diagonal is numbered 1,2,3
  • T(1,2) = 4, T(1,3) = 5, T(2,3) = 6, and corresponding symmetrical elements are also thusly defined.

vi and vj simply correspond to the mapping from Voigt index representation to the (i,j) representation of a matrix. (Rows, Columns).

  • vi = 1,2,3,1,1,2
  • vj = 1,2,3,2,3,3

The following examples employ Voigt optimizations:

Ein> Set F{} = {3,3};
Size of F is List(3, 3)
Ein> C{[i,k]} = F{j,i}*F{j,k};
for (i=0;i<6+0;i++){
for (j=0;j<3+0;j++){
C[i] +=F[j][vi[i]]*F[j][vj[i]];
}
}
 
Ein> CC = C{i,j}*C{[i,j]};
for (i=0;i<3+0;i++){
for (j=0;j<3+0;j++){
CC +=C[vg[i][j]]*C[vg[i][j]];
}
}
 
Ein> Csquared{[i,k]}=C{i,j}*C{j,k};
for (i=0;i<6+0;i++){
for (j=0;j<3+0;j++){
Csquared[i] +=C[vg[vi[i]][j]]*C[vg[j][vj[i]]];
}
}
 
Ein> Set kron = {[3,3]};
Size of kron is List(3, 3)
Ein> E{[i,j]}=0.5*(C{ij}-kron{ij});
for (i=0;i<6+0;i++){
E[i]+=0.500000*C[i];
E[i]-=0.500000*kron[i];
}
 
Ein> env;
Tensor	DimSize	ActualSize
*******************
CC	List(1)	List(1)
d	List(3)	List(3)
a	List(3)	List(3)
A	List(3, 3)	List(3, 3)
F	List(3, 3)	List(3, 3)
kron	List(3, 3)	List(6)
C	List(3, 3)	List(6)
p	List(1)	List(1)
b	List(3)	List(3)
Csquared	List(3, 3)	List(6)
x	List(3)	List(3)
E	List(3, 3)	List(6)

To test this yourself, I have included a compiled version of Ein and Kiama. The following line can be used to run the Ein Interpreter:

scala -classpath kiama_2.8.0-1.0.0.jar:ein110111.jar net.unoc.hawflakes.ein.Interpreter

Feedback

Feedback regarding bugs in index notation implementation, feature requests, and general constructive comments/questions are welcome. Code isn’t shown, since the code is a little messy and not as good as I ideally would like it. Please leave them in the moderate comments below.

VLC GOMTV SQLive URL extractor

October 25th, 2010 13 comments

Update 03/11 As far as I know GOMTV has started checking for VLC specific information, so the script probably won’t produce anything that works in VLC anymore. I don’t have time to write a Lua plugin for VLC, but that would be ideal to spoof as GOMPlayer.

Update 01/11/11: Updated user script to reflect Foo’s comment below. Code updated below.

Update 10/28/10: Fixed include of which webpages the greasemonkey script runs on. Before sometimes if the url did not match the one in the instructions it would not show the link.

I’ve have a lot of WIPs so I haven’t updated any of the content on this blog frequently. However, the hassle of watching GSL streams in P.S.T. have gotten to me, and at this point I’d rather sleep than watch Idra, Check, SlayersBoxer, Nada, Fruitdealer play Starcraft 2 live. The issue is that VODs are not free from GOMTV, and the pirated Youtube streams are a bit of a hassle to find. The massive viewers on restreamed livestreams easily reduces any computer and it’s network connection to dust.

One easy work around is to just watch the live streams from GomTV, since they are free even though the VODs are not. 720p HQ quality is offered at a premium, but SQLive is free and acceptable. However, since the hours are ridiculously harsh for people living on the West Coast of North America (3 a.m. to 6 a.m.) the best solution is to just record the live stream.

Luckily someone had figured out how to extract the SQLive stream link from the GOMTV pages with detailed instructions on reddit. However, the instructions are not particularly clear, and it seems GOMTV decided to change some of the URLs slightly. Last night I managed to verify that these instructions work with some modification and it seemed fairly reasonable to right a greasemonkey script, since most likely I will be extracting these links several times per day. Since this is tedious, and partially because I’m lazy, I decided to write a greasemonkey script to do extract the link while suffering from minor insomnia last night.

Usage:

  1. For GSL2, login and go to http://www.gomtv.net/2010gslopens2/live/.
  2. Link will show up in red next to “View Live” at the top of the page.

Greasyscript below: (or download vlcgomgreaser.user)

// ==UserScript==
// @name           VLCGOMGreaser
// @namespace      hawflakes.unoc.net
// @description    Get VLC link from GOMTV SQLive
// @include        http://www.gomtv.net/*/live/*
// ==/UserScript==
 
//
 
var script = findXPathNode("//div[@id=\"league_mainBody\"]/script");
var scriptcontent = script.innerHTML;
var start = scriptcontent.indexOf("goxUrl");
var start = scriptcontent.indexOf("http", start);
var url = scriptcontent.substring(start, scriptcontent.indexOf(";", start+1));
var url = url.replace(/strLevel=[^&]+/, "strLevel=SQTest");
var url = url.replace(/&title=[^;]+/, "");
 
var linkpage = get(url,"",showlink);
 
function showlink(client,info) {
	var contents = client.responseText;
	var start = contents.indexOf("LiveAddr=") + "LiveAddr=".length;
	var end = contents.indexOf(";\"/");
	var url = unescape(contents.substring(start,end));
 
	var insertion = findXPathNode("//*[@id=\"chmenu_title_container\"]");
	insertion.innerHTML += "<a href="\">"+url+"</a>";
}
 
function get(url, data, cb,info) {
	var client = new XMLHttpRequest();
	client.open("GET",url,true);
 
	client.onreadystatechange = function () {
		if(client.readyState==4) {
 
			cb(client,info);
 
		}
	}
	client.send(null);
}
 
function findXPathNode(xpath, start,doc)
{
	var result = (doc == null ? document : doc).evaluate(xpath,(start == null ? document : start), null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE ,null);
	return (result.snapshotLength &gt; 0 ? result.snapshotItem(0) : null);
}

Tecplot’s avi use broken windows codecs

April 29th, 2010 No comments

For those of you that use tecplot, you know that while the interface is fairly straightforward and the resulting visuals look pretty good, tecplot has one major flaw. The animations are generated in with essentially the MSRLE codec from a decade ago. The codec’s color palette generally cannot handle continuous color flooding  in tecplot; if you try to open it up on other machines or convert them to say flash video or mp4, you will find that the the colors are all messed up. Sometimes the frame size in tecplot causes the video to not display correctly.

So to fix this issue, I found out that the swf output actually works correctly. Modifying the previous pbox2avi script, I created a tecplot specific script to generate mpeg’s or whatever else ffmpeg can spit out. The script is pretty much a more rudimentary version of pbox2avi.py.

Requirements: swftools, ffmpeg

Download tecplot2mpeg.py.

As usual the code for the script is below:

# Copyright 2010 Jonathan Wong
# python script to convert SWF files generated from tecplot to mpeg
 
#requires swfextract
 
import commands,sys,re,math
 
def parse(filename):
#check filename for .swf extension
if not filename.find('.swf'): # not a comprehensive check
return
 
fileprefix = filename[0:filename.find('.swf')]
commands.getstatusoutput("rm %s.mpeg" % (fileprefix,))
 
#open file
status, output = commands.getstatusoutput("swfextract %s" % (filename,))
 
print output
 
#find "JPEGs: ID(s)"
slide_identifier = "PNGs: ID(s)"
start=output.find(slide_identifier)
slide_extract=[]
if start:
start += len(slide_identifier)+1
slide_end = output.find("[-f]",start)
print start,slide_end
slide_extract=output[start:slide_end-2].replace(" ","")
 
# now extract all the data
print "swfextract %s -P -p %s" % (filename,slide_extract)
 
slides = slide_extract.split(",")
digits = 1 + int(math.log10(int(slides[-1])))
 
for slide in slides:
print "Processing frame: %d" % (int(int(slide)/2)+1,)
print ("swfextract %%s -p %%s &amp; mv output.png %s%%0%dd.png" % (fileprefix,digits,)) % (filename,slide,int(int(slide)/2)+1)
status, output = commands.getstatusoutput(("swfextract %%s -p %%s &amp;&amp; mv output.png %s%%0%dd.png" % (fileprefix,digits,)) % (filename,slide,int(int(slide)/2)+1))
 
status, output = commands.getstatusoutput("ffmpeg -f image2 -i %s%%04d.png -b 600k %s.mpeg &amp;&amp; rm %s*.png" % (fileprefix,fileprefix,fileprefix))
return slide_extract
 
if __name__=="__main__":
slides=parse(sys.argv[1])
Categories: Useful Apps Tags: , , , , ,