Wednesday, October 16, 2013

Books and Courses

A list of books and courses that I've found highly instructive.

Project Management
  • How to Run Successful Projects III - The Silver Bullet by Fergus O'Connell

Design

Programming Language


Largely inspired by this reading list on Agile.

Saturday, July 20, 2013

Sym Object Model

It might be a good idea to model Sym's internal structure on the JavaScript object [1]:
  • A JavaScript object is an unordered collection of properties.
  • Properties consist of a name and a value.
  • Objects can be declared using object literals.
  • Top-level variables are properties of window.
Imagine what it would be like if we could construct a Sym object using nothing more than JSON notation.
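
A minimal sketch of that idea in Python, assuming a Sym object is nothing more than an unordered collection of named properties. The property names used here (name, interval, links) and the load_sym helper are hypothetical and only for illustration; nothing in Sym defines them yet.

```python
import json

# Hypothetical JSON description of a Sym object. The property names are
# illustrative assumptions, not part of any actual Sym design.
SYM_JSON = '''
{
    "name": "mailcheck",
    "interval": 5,
    "links": ["diffcheck"]
}
'''

def load_sym(text):
    """Parse JSON text into a plain dict of properties."""
    obj = json.loads(text)
    if not isinstance(obj, dict):
        raise ValueError("a Sym object must be a JSON object")
    return obj

sym = load_sym(SYM_JSON)
print(sym["name"], sym["interval"], sym["links"])
```

A plain dict keeps the same shape as the JavaScript object described above: names mapped to values, with no fixed order.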

Source:
  1. jQuery in Action

Sunday, June 30, 2013

Sym folders

Folder Structure

Sym
  \sym  : source code
  \test : testing code
  \docs : documentation

On Testing

I also like Thomas Andrews's idea in [1] of putting the unit test code in the module itself:
if __name__ == '__main__':
   do tests...

References:

  1. http://guide.python-distribute.org/example.html
  2. http://stackoverflow.com/questions/61151/where-do-the-python-unit-tests-go

Add modules by specific path - e.g. for testing

import sys
sys.path.append('..')  # add path to module, in this case the parent directory

import some_module
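
Note that '..' is resolved against the current working directory, not against the location of the script. Below is a sketch of a variant that anchors the path to a file instead; the helper name add_parent_to_path is hypothetical.

```python
import os
import sys

def add_parent_to_path(file_path):
    """Append the parent of file_path's directory to sys.path (only once)."""
    here = os.path.dirname(os.path.abspath(file_path))
    parent = os.path.dirname(here)
    if parent not in sys.path:
        sys.path.append(parent)
    return parent
```

In a test script this would be called as add_parent_to_path(__file__) before importing the module under test.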

Saturday, June 1, 2013

Python Tip - If you've accidentally deleted your .py files, all is not lost...

Recover .py files using uncompyle

If you've accidentally deleted your .py file, all is not lost. Use uncompyle to recover:

  git clone https://github.com/gstarnberger/uncompyle.git
  cd uncompyle/
  sudo ./setup.py install

  uncompyler.py thank_goodness_this_exists.pyc > recovered_file.py


Python Tip - Create and detach a child process

Create and detach a child process

Use the following code to create and then detach a child process (DETACHED_PROCESS is a Windows-only creation flag):

  import subprocess
  import sys

  DETACHED_PROCESS = 0x00000008  # Windows-only creation flag
  pid = subprocess.Popen([sys.executable, "longtask.py"],
                         creationflags=DETACHED_PROCESS)

  pid.poll()  # Returns: None if child still active
              #          returncode otherwise


In Sym, a complex component will be made up of a parent process and related child processes. 
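
The creationflags argument only works on Windows. Here is a sketch of a cross-platform variant, assuming Python 3.2 or later, where subprocess.Popen accepts start_new_session on POSIX systems. The helper names detach_kwargs and spawn_detached are hypothetical.

```python
import subprocess
import sys

DETACHED_PROCESS = 0x00000008  # Windows-only creation flag

def detach_kwargs(platform=sys.platform):
    """Return the Popen keyword arguments that detach a child process."""
    if platform.startswith("win"):
        return {"creationflags": DETACHED_PROCESS}
    # On POSIX, start the child in its own session instead.
    return {"start_new_session": True}

def spawn_detached(args):
    """Start args as a detached child and return the Popen object."""
    return subprocess.Popen(args, **detach_kwargs())
```

For example, spawn_detached([sys.executable, "longtask.py"]) returns a Popen object whose poll() method can be used as before.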


Wednesday, May 29, 2013

Using sym for notifications and predictions

As a follow-up to the sym workflow posted earlier, here are some uses for sym.

Notification applications

The following applies to any notify-on-change app:
  1. Create a sym called, say, goldpx that continuously checks the price of gold
  2. Link goldpx to diffcheck
  3. Get SMS notifications whenever the price of gold changes

Prediction applications

This is quite rough but does seem doable:
  1. Create a sym that's a neural network called, say, nn
  2. Set nn mode to train
  3. Link goldpx to nn for a day or so
  4. Set nn mode to predict
  5. Link nn to diffcheck
  6. Get SMS notifications whenever the neural network predicts a change in the price of gold

Workflow to send SMS on new emails

Workflow diagram

Workflow components

There are four components:
  1. mailcheck - checks for unread emails
  2. diffcheck - checks for differences between current and previous list
  3. sms - sends sms
  4. sup - configures sms, diffcheck and mailcheck

Workflow flow

The workflow goes like this:
  1. mailcheck checks for unread emails every n seconds, bundles them into a list and sends the list to the input stream of diffcheck
  2. diffcheck compares the current list of emails with the previous list, determines which ones are new, bundles those into a list and sends that off to sms
  3. sms looks at the incoming list and sends it to the phone numbers associated with the name 'diffcheck'
    • but nothing really happens at first because we've not yet associated the name 'diffcheck' with anything
    • so the incoming data is just discarded
  4. sup sends a command to sms, associating the name 'diffcheck' with a phone number
    • now sms is configured to send out messages associated with 'diffcheck'
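
The diffcheck step in the flow above can be sketched as follows. This is a minimal illustration rather than the actual component: it assumes each email is represented by a hashable value such as a message ID, and the names diff_new and DiffCheck are hypothetical.

```python
def diff_new(previous, current):
    """Return the items in current that were not in previous, in order."""
    seen = set(previous)
    return [item for item in current if item not in seen]

class DiffCheck:
    """Remembers the last list it saw and reports only the new items."""

    def __init__(self):
        self.previous = []

    def check(self, current):
        new_items = diff_new(self.previous, current)
        self.previous = list(current)
        return new_items
```

Each call to check() plays the role of one polling cycle: the first call reports everything as new, later calls report only what was added since the previous call.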

Status

Have got a very rough working version.

Areas to improve

  1. diffcheck should just pass data through instead of polling; the name should also just pass through
    • so instead of send_out() we could call something like pass_thru()

A good name and an actionable readme on GitHub are important

A comparison of two projects

Compare jsonschema and json-schema-validator on GitHub for do's and don'ts e.g.:

Short, succinct package names

jsonschema is better than json-schema-validator

Well-thought-out, actionable readmes


json-schema-validator readme

jsonschema readme

Tuesday, May 28, 2013

Useful Python Packages for Sym

List of useful Python Packages

  1. Regular Expressions -- re
    • Very powerful but takes a bit of getting used to - Perl hackers should have no problems
  2. Parsing HTML -- lxml | lxml+xpath
    • Install from binaries on the Windows platform
    • lxml expects clean XML; use regular expressions to tidy the input if it isn't clean
  3. Logging -- logging
  4. GUI -- Tkinter | WxPython | Other GUI Toolkits
  5. JSON-driven GUI -- why? | pytkgen
  6. JSON schema -- why? | jsonschema | examples | other implementations

Exception Handling Workflow

The figure below shows my idea of handling exceptions, thanks to ideas by Rebecca J. Wirfs-Brock [1].
Pardon the messy lines:


References:

  1. Toward Exception-Handling Best Practices and Patterns

Monday, May 27, 2013

edX courses of interest

An unsorted list of edX courses of interest:
  1. Globalization’s Winners and Losers: Challenges for Developed and Developing Countries
    • Starts 1 Oct 2013, 7 weeks, GeorgetownX
  2. Take Your Medicine - The Impact of Drug Development
    • Starts 16 Sep 2013, 13 weeks, UTAustinX
  3. Solar Energy
    • Starts 16 Sep 2013, 8 weeks, 8 hrs/wk, DelftX
    • Best student will be invited to the Delft University of Technology for a week
  4. Stat2.3x - Introduction to Statistics: Inference
    • Starts 31 May 2013, 4 weeks, UCBerkeleyX, Prereqs: Stat2.1x and Stat2.2x
  5. Stat2.2x - Introduction to Statistics: Probability
    • Starts 12 Apr 2013, 5 weeks, UCBerkeleyX, Prereqs: Stat2.1x
  6. Stat2.1x - Introduction to Statistics: Descriptive Statistics
    • Starts 20 Feb 2013, 5 weeks, UCBerkeleyX, Prereqs: High School Arithmetic
  7. Shakespeare: On the Page and in Performance
    • Starts Oct 2014, 4-5 hrs/wk, WellesleyX, No Prereqs
  8. Was Alexander Great?
    • Starts 27 Jan 2014, 4-6 hrs/wk, WellesleyX, No Prereqs
  9. Genomic Medicine Gets Personal
    • Starts 3 Apr 2014, GeorgetownX
  10. The Ancient Greek Hero
    • Starts 13 Mar 2013, 4-6 hours/week, HarvardX, No Prereqs
  11. Health in Numbers: Quantitative Methods in Clinical & Public Health Research
    • Starts 15 Oct 2012, 13 weeks, 10 hrs/wk, Prereqs: Algebra
  12. CS169.1x - Software as a Service
    • Starts 15 Mar 2013, 4 weeks, Prereqs: OO Programming
  13. CS169.2x - Software as a Service
    • Starts 15 Feb 2013, 6 weeks, 12 hrs/wk, Prereqs: CS169.1x

Sym Overview with Twilio SMS integrated

A short overview of Sym with Twilio SMS integrated.

Overview of Sym with Twilio sms Integrated

The moving parts:
  1. sym checks the cmd_queue every 5 seconds for new commands
  2. if an sms command is found, send it to sym_twilio which will then send an sms out
  3. everything that sym does is logged in sym.log
The top right-hand corner shows an example of what goes into the cmd_queue and how the commands end up at sym_twilio and sym.log respectively.
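
The moving parts above can be sketched as a single dispatch function. This is an illustration under assumptions, not the actual Sym code: the command format "sms <number> <message>" and the name process_commands are hypothetical, and the SMS sender and logger are passed in as plain callables so the sketch runs without Twilio.

```python
def process_commands(commands, send_sms, log):
    """Log every command and forward well-formed sms commands to send_sms.

    Returns the number of SMS commands forwarded.
    """
    sent = 0
    for line in commands:
        cmd = line.strip()
        if not cmd:
            continue
        log(cmd)  # everything that sym does is logged
        # Hypothetical command format: sms <number> <message>
        parts = cmd.split(None, 2)
        if parts[0] == "sms" and len(parts) == 3:
            send_sms(parts[1], parts[2])
            sent += 1
    return sent

# Usage with stand-ins for sym_twilio and sym.log
outbox, logged = [], []
count = process_commands(
    ["sms +15550100 hello world\n", "noop\n"],
    send_sms=lambda number, message: outbox.append((number, message)),
    log=logged.append,
)
print(count, outbox, logged)
```

Passing the sender and logger in as arguments keeps the dispatch logic testable on its own; the real versions would wrap the Twilio call and a write to sym.log.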

On Grok, Jeff Hawkins's machine intelligence


Recent Videos

Building brains to understand the world's data - Published Mar 10, 2013

Implementations

There is an open source implementation of Grok here [*]: Neocortex_1.4.2c.zip, last modified on Dec 14, 2008
  • See also the blog by Neocortex creator, zotric.

Books

There's an online version of the book "On Intelligence".



[*] Thanks to a tip by a fellow Courserian.

Saturday, May 25, 2013

Why build Sym?

Push vs Pull Computing

A typical email session involves checking our mail, figuring out which ones are urgent and important and responding to them. Generalizing that, here's what we do:
  1. scan for changes in the environment, 
  2. decide if any action is required, and
  3. respond accordingly.
Increasingly, the online environment occupies more and more of our attention. The main reason is that our physical environment is increasingly represented online. As a result, we end up scanning our online environment more and more, spending time that would be better spent on leisure.

The above computing mode is what I call pull computing.

Now imagine a different computing mode, push computing, where a program exists that:
  1. automatically scans for changes in the environment 
  2. automatically decides if an action is required 
  3. either takes that action on our behalf or prompts us to take action
This is not new - we already do this in a simple form. Consider what happens whenever we set an alarm to wake us up in the morning:
  1. clock automatically "scans" the time
  2. clock automatically checks if the current time matches the alarm
  3. clock rings and prompts us to action
A non-trivial example is a simple program I wrote to monitor the water levels in the canals near my house, to decide if I needed to evacuate during the devastating 2011 flood that inundated many parts of Bangkok. (I was very fortunate that no evacuation was needed.)
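
The scan/decide/act cycle can be sketched in a few lines. This is only an illustration of the idea: the names push_step and run_alarm_demo are hypothetical, and the alarm clock is replayed over a fixed list of times rather than a real clock.

```python
def push_step(scan, decide, act):
    """Run one scan/decide/act cycle; return True if an action was taken."""
    observation = scan()
    if decide(observation):
        act(observation)
        return True
    return False

def run_alarm_demo(times, alarm):
    """Replay the alarm clock example over a fixed list of times."""
    ticks = iter(times)
    rings = []
    results = [
        push_step(scan=lambda: next(ticks),        # "scan" the time
                  decide=lambda now: now == alarm,  # does it match the alarm?
                  act=rings.append)                 # "ring"
        for _ in times
    ]
    return results, rings

results, rings = run_alarm_demo(["06:58", "06:59", "07:00"], alarm="07:00")
print(results, rings)
```

Only the three callables change from one application to the next; the cycle itself stays the same whether it is watching a clock, a mailbox or a canal's water level.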

Why Sym? 

Just as in the physical world, we cannot be awake 24x7. Sym is a system that scans the environment on our behalf and prompts us to take action when necessary.

We are then freed to think and to work on things that push the envelope.

Designs that last are based on scientific/mathematical principles

Designs That Last

An utterly fascinating Google TechTalk, Building brains to understand the world's data, proposes building machine intelligence on the basis of neuroscience principles.

Another example is relational database systems, which are built on the principles of relational algebra.

Here is a rough outline of how to apply this idea:
  1. Study a system that works exceptionally well
  2. Extract the principles/theory underlying how the system works
  3. Apply those principles/theory to the system you are building
It would be really interesting to get a listing of all the systems currently in use together with their underlying principles.

Or perhaps, as a start, identify all the systems that are working exceptionally well, e.g. the brain, ecosystems, innovative companies; then extract the principles.

And a reverse way to apply this idea is to start with the principles and work out what we can build using those principles.

Related Talks

A related talk on how the brain works is, Think Faster, Focus Better and Remember More - Rewiring Our Brain To Stay Younger

How to get Sym to send SMS

The first thing that we need is for sym to communicate with us. A simple way is to send an SMS.

Send SMS via Twilio

Twilio (twilio.com) offers a trial account that can send an SMS programmatically.

Below is a Python script that interfaces with the Twilio API.
Usage:

# twilio_sms.py - a command-line tool to send SMS
# python twilio_sms.py  <recipient> <message>

# Get twilio-python library from http://twilio.com/docs/libraries
from twilio.rest import TwilioRestClient
import sys

def send_sms(number, msg):
    """Send an SMS via Twilio.

    number:str - Phone number that will receive the SMS
    msg:str    - Message content
    """
    account_sid = "xxxxxxxx"
    auth_token = "yyyyyyyy"
    client = TwilioRestClient(account_sid, auth_token)
    client.sms.messages.create(to=number, from_="+1234567890", body=msg)

if __name__ == '__main__':
    number, msg = sys.argv[1:3]
    send_sms(number, msg)


Send SMS via Google Calendar

It is also possible to send an SMS using the Google Calendar API. The general idea is as follows:
  1. Create a Google calendar with Notifications enabled to send SMS zero minutes before the appointment
  2. Create a program to add events into that calendar
In effect, every time an event is added, an SMS will be sent. I'll post the code for that in a later post.

Comparing the two options

All points are in favour of the Twilio trial account SMS:
  1. Twilio trial SMSes are sent immediately; Google Calendar SMSes arrive after a delay (30s-45s).
  2. Google Calendar SMS requires a one-time manual authentication; Twilio trial SMS doesn't.
  3. Both options add text around the actual SMS text, but Twilio adds less:
    • Twilio Trial SMS Text:
      • Sent from your Twilio trial account - <SMS Body>
    • Google Calendar SMS Text:
      • Reminder: <SMS Body> @ Sat May 25, 2013 08:30 (_)

Thursday, May 23, 2013

Learning How To Mine Massive Datasets

There is an online book and online homework assessment system available for self-study of mining massive datasets.
Other Omnibus include:
  • Hopcroft-Motwani-Ullman Automata: 4A379A91
  • Garcia-Ullman-Widom or Ullman-Widom Databases: E68759F1
  • Aho-Lam-Sethi-Ullman Compilers: 467454C2
  • Elmasri-Navathe Databases: 6F977376
  • Tanenbaum OS: 328E417C
  • Stallings OS: 72377233
  • Liang Java: D978043E
  • Rajaraman-Ullman Data Mining: 1EDD8A1D
  • Tan-Steinbach-Kumar Data Mining: 3426AAF1
  • Carrano Data Structures: D89F06AD
  • Aho-Ullman Foundations of CS: 8CD5ED01

Friday, May 17, 2013

A first iteration of sym.

The first iteration of sym runs in the background and does the following every 5 seconds:
  • checks whether the file cmd.sym exists and keeps looping if it does not
  • if the file contains "terminate", sym stops running
  • if the file contains anything else, sym reads it and writes it out to sym.log
# File: sym.py
# Author: hoekit [at] gmail [dot] com
from multiprocessing import Process
import os
import time

def sym_do(cmds):
    # Append each command to sym.log; stop at the first 'terminate'.
    # The with-block closes the log file even on an early return.
    with open('sym.log', 'a') as fp:
        for x in cmds:
            if x.strip() == 'terminate':
                return False
            fp.write(x)
    return True

def sym_run():
    # Read and delete cmd.sym; a missing file just means "keep looping".
    try:
        fname = 'cmd.sym'
        with open(fname, 'r') as fp:
            commands = fp.readlines()
        os.remove(fname)
        return sym_do(commands)
    except IOError:
        return True

def sym_loop():
    keep_running = True
    while keep_running:
        keep_running = sym_run()
        if keep_running:
            time.sleep(5)

if __name__ == '__main__':
    sym = Process(target=sym_loop)
    # sym.daemon = True
    sym.start()
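
To talk to the running loop, something has to write cmd.sym. Below is a small sketch of such a helper; the name send_command is hypothetical, and the sketch ignores the possibility that sym reads and deletes the file while it is being written.

```python
def send_command(command, fname='cmd.sym'):
    """Append one command line to the file that sym polls."""
    with open(fname, 'a') as fp:
        fp.write(command.rstrip('\n') + '\n')
```

For example, send_command('hello') should end up in sym.log within about 5 seconds, and send_command('terminate') stops sym.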

These two links are great tutorials on how to spawn processes in Python:
* http://pymotw.com/2/multiprocessing/basics.html
* http://pymotw.com/2/multiprocessing/communication.html

In the beginning, there was sym.

For the computer to be doing things for us, there must be some kind of running program.
Let's call it sym - short for symbiotic computing.

So the first requirement for sym is that it is always running.
And we can ask sym to do things for us.

How? I don't know yet but we'll figure it out along the way.

A simple idea - get your computer to do things on your behalf as much as possible

The idea is fairly simple and not very new.
A computer can be always on. And with broadband, it can be always connected.

With the right set of programs running, your computer can do things on your behalf quite easily.
That's what this blog is about.

Getting your computer to do things on your behalf as much as possible.