Archive for the ‘Language’ Category

Fusible Traces

Well, finally getting back to some (or a) posting after some time. Fortunately or unfortunately depending on how you look at it I was heads down in project work at my last semester at GA Tech. Now, I have a few brief moments between ramping up on a new gig and the moving thing (back to SJ, once again). Figured I would take some time to convert over an overview of a fun project I did along with Shauvik Roy Choudhary and Jared Klumpp, two talented CS students here at Tech and authors of the content below. It was part of Nate Clark’s CS8803-DynamicCompilation course, which I highly recommend for anyone reading this from Tech. The class ROCKED, had quite the cast of characters in there with strong backgrounds, but unfortunately I missed the Tab tasting day, but wtf drinks Tab anyway 😉 Ok, thats a few inside remarks, but for anyone interested in VMs and Tracing read on. The idea is to improve tracing results by fusing loops together prior to any recording. The project was in large part driven by the “Trace-based just-in-time type specialization for dynamic languages” PLDI 09 paper, and thanks out to initial ideas from guys back at Adobe and Mozilla IRC channel.

1. Introduction

Trace compilation represents a different paradigm than traditional just-in-time (JIT) compilation techniques. Rather than adhering to principles derived from traditional static compilers such as whole method compilation, trace compilation relies on runtime profiling information, making it unique to the domain of dynamic compilation. With trace compilation program execution starts in the interpreter. While the interpreter executes bytecodes, the VM maintains profiling information to identify frequently executed (hot) loops within the program. In a trace-based approach it is assumed that backward branches represent a control flow transfer from the end of the loop to the beginning of the loop. When such branches becomes hot, i.e., when the execution count crosses a certain threshold, traces are recorded and compiled. The benefits of tracing include the ability to easily inline procedure calls, simplified on-stack replacement, and the compilation of only relevant code areas (cold code is ignored).

Trace compilation was a major component of the JIT compiler provided in the FireFox 3.1 browser [2]. The trace based JIT, TraceMonkey, used trace compilation as a means of asserting type stability within a loop. Unlike method-based dynamic compilers, TraceMonkey operates at the granularity of individual loops. This design choice was based on the expectation that programs spent most of their time in hot loops. Furthermore, we can expect hot loops to be mostly type-stable, meaning that the types of values are invariant. TraceMonkey relies on type stability to generate type specialized code during tracing, obtaining an overall 10x speedup in performance [4].

The objective of this project, JSS, is to facilitate Trace-Monkey’s use of Trace compilation by increasing the scope of what gets included in a trace. More specifically we look at applying loop-fusion as a means of effectively fusing what would otherwise be separate traces, creating larger traces overall. The details of our implementation are discussed in section 3, followed by the results in section 4. Fusing loops before the tracer adds difficulties because parsing and compilation into a bytecode representation within TraceMonkey occurs in a single pass. These constraints and their potential workarounds are discussed in section 5.

2. Related Work

There has been much significant work in the area of Tracing as of recent. In particular, the benefits of trace over whole method compilation are still being debated and evaluated. Accordingly, the two areas of related work discussed in this section are investigations into whole method compilation and expansions of prior work in trace compilation for dynamic languages.We discuss both below and present several other more loosely related topics.

As mentioned above, the debate of Whole method vs. Trace compilation is still ongoing. Much of the debate has spurred by the fact that Google’s V8 [5] VM utilizes a whole method compilation technique as does a new research project at Mozilla, J¨agerMonkey [9]. Google’s V8 performs whole method compilation prior to execution in order to gain native-execution performance gains. The major benefit of performing whole method compilation as opposed to tracebased compilation is the increase in granularity for optimization, which would allow for loop optimizations in the native code. In addition to basic JIT compilation, V8 performs additional optimizations on JavaScript. For instance to improve the speed of property access, V8 uses hidden classes in the background to establish commonalities between objects. This allows traditional class based optimizations such as inline caching to be applicable to JavaScript objects.

At the same time there is possible work going forward to improve the usage of trace complication in the context of dynamic languages. Andreas Gal has discussed extending TraceMonkey’s current implementation with parallel dynamic compilation and recompilation of traces, the folding of trees with minimal differences, more precise load propagation within a tree, and improved register allocation [3].

Other related efforts include the more general use of loop-fusion and type specialization. Static compilers use loop-fusion to improve data locality and reduce the number of branches and overhead when loops have similar properties[7]. Loop-fusion also brings opportunities for parallelism and later optimizations. These additional optimizations could potentially increase the performance of the code but come at the cost of complexity of analysis and restructuring. However, the run-time goals of JavaScript preclude overly complex optimizations that may not pay off in execution. There has also been work in type specialization using pure static analysis. TAJS [6] uses abstract interpretation to infer sound types that can be used to detect common programming errors and assist program comprehension. Similarly, RATS [8] is focussed on aggressive type specialization that finds accurate information about the variable interval (range), kind (integer, float or NaN) and also relates it’s value. In their study, they report considerable improvements on numerical benchmarks.

In this paper, we improve the trace based type specialization in TraceMonkey by performing loop-fusion on the program’s AST. Hence, this augmented technique leverages both static as well as dynamic information of the program, thereby harnessing the benefits of both.

3. Implementation

JSS seeks to broaden the granularity at which tracing occurs. Increasing the vision of the tracer through loop-fusion reduces the number of loops, and inherently branches, as well as increases the opportunity for optimizations within the loops. JSS therefore performs loop-fusion during parse time in order to occur before the tracer and avoid adding additional passes, which would decrease overall performance.

Figure1: Tool implementation within SpiderMonkey

The TraceMonkey parser scans and parses one statement at a time. After a statement has been parsed, the generated AST is emitted to a stack-based intermediate representation and parsing begins for the next statement. Whenever loops are emitted to IR, additional instrumentation is inserted to control the tracer and count the number of trips. After the file has been parsed and emitted, the interpreter begins executing at the top of the IR. When the trace instructions have been executed a specified number of times, execution pauses while the tracer consumes the IR and emits native code for that loop. Parsing also performs rudimentary data-flow analysis for constant propagation and folding. JSS uses this information to characterize loops at parse time and identify data-flow dependencies. Only loops within a higher-level statement construct, such as a function, may be fused. This occurs because of the self-imposed restriction that loop-fusion occurs within parse time and characteristic of the parser to emit the IR on a per-statement basis (as opposed to emitting after the entire program has been parsed). First the AST nodes were augmented with an additional data structure, LoopInfo, which characterizes the loop. The characteristics required for fusion are the loops bounds, direction and stride. Identifying these features requires the use of usedef chains created during parsing. Very basic induction variable identification assumes that the induction variable must occur in the condition of the loop, for the proof-of-concept. The induction variable is then traced back to its definition through the use-def chains to identify one of the constant bounds. The other side of the conditional expression is then traced through the use-def information to identify the other constant bound. If either of these bounds results in an expression or global variable, then the array bounds cannot be safely determined and loop-fusion is not possible. Both the loop body and header (consisting of the conditional) are examined to identify the stride. The stride must be a monotonic, constant change, and only basic induction variables are supported (such as d = d +/- C). Dependency identification at parse time proved more challenging, because the ASTs lack features that would greatly ease their identification. Were these features present, they would still be undesirable because they would essentially add several passes of analysis for each loop. Instead JSS identifies dependencies after the loop has been completely parsed, using the aforementioned use-def chains and list of generated variables. Because the variables used by a loop cannot be propagated to the top-level AST node without augmenting every aspect of the parser, the list of all variables in the program serves as the mechanism for identifying dependencies. After loops are parsed the list of variables is scanned and each variable’s use-def set examined for a use or definition occurring within the loop. If this occurs then the loop will be conservatively marked as carrying a dependency on that name, although this could be improved to more aggressively remove dependencies. The code lying between two loops a and b can also create potential dependencies. If the definition of a variable, x1 used in loop b, depends upon a variable x2 used or defined in loop a, then a dependency is formed from b to a. This is the only dependency-creating condition, but additional source transformations must occur for code lying between the two loops:

  1. The intermediate code contains the use of a variable defined or used in a. In this situation the code can be sunk below b.
  2. The definition of a variable that is used by b occurs in the intermediate code. In this situation the definition can be
    hoisted above a.

The appropriate actions for each of the above two situations occurs for all intermediate code between a and b. Any remaining code carries no dependence on the loops and can arbitrarily be sunk or hoisted depending upon other data-flow dependencies. The final dependency causing condition occurs because of global variables and function calls. Loops are dependent upon any global variable defined or accessed within the loop, and if the use or definition of a global variable occurs between the two loops then they cannot be merged, because the effects of globals are difficult to determine. This is discussed in further detail in section 5, Problems. Function calls cause dependencies from the loop to all global variables, because the effects of a call cannot be determined at parse time. Two loops a and b are then considered for fusing after the identification of loop-characteristics and dependencies. First the dependencies are checked to ensure that there is no dependency from b on a and that the intermediate code between a and b does not carry a use of a dependency from a in the same statement as the definition or use of a dependency from b. If the two loops pass the dependency analysis then they can be transformed and merged. The merging process first creates a pre-header for both loops and moves the initialization variables (for for-loops) into this pre-header. The update-component is then appended to the end of the loop body for for-loops. Then, the trip count for both loops is calculated. If both loops execute the same number of times then the bodies are fused and the condition for the fused loop comes from the first loop. Otherwise, the loop with the smaller trip count has a guard with its current condition inserted around the body, the two bodies are fused, and the condition for the fused loop is created using the condition condition from the larger loop. This is clearly an opportunity for loop-peeling and is discussed in section 6, Future Work.

TraceVis1 was used to profile TraceMonkey and determine how much time the tracer was spending in each phase of execution. TraceMonkey emits timing events to a log events whenever the compiler phase changes, which the TraceVis tool can later analyze. The post-execution analysis reduces the overhead and provides more accurate timing results. The two TraceVis tools used were Vis, a visualization creating tool, and binlog, a text-based analysis tool that was used to quantify times spent in each phase. These tools were used to analyze the performance of several benchmarks before and after application of JSS.

4. Results

The results of a specific test case we designed for loopfusion and two benchmark programs from sunspider [1] are shown in table 1.

Table1: Time spent in each JS execution and tracing phase

The first program, which we designed, tests a best-case situation where two loops are immediately next to each other, increment in the same direction, but have different bounds. The second program, modified from sunspider to allow for loop-fusion, measures binary tree access times. The third program, access-nbody.js, was the only SunSpider test that allowed for loop-fusion. The bound for access-nbody’s fusible loops, although the same was variable rather than constant. JSS was modified to account for variable bounds. Each benchmark was repeated 100 times and the average speed found. Figure 2 shows the amount of time spent in each phase of TraceMonkey, with different colors representing the different phases. The colors have been shown in the key in Figure 2(g).

Figure2: Visualization of JS execution and tracing phases

The top row of each image graph is read by reading top to bottom, left to right. The middle section shows the expected speed up of native execution over interpretation, while the bottom row shows the proportion of time spent in each phase of the JavaScript engine.

Applying our optimization should result in the reduction of the overhead, which would mean less total time spent monitoring, recording, compiling and setting up and tearing down. The first and third images show the timing without the JSS optimization while the second and fourth show the timing after applying JSS, respectively. The second graph (Fusion1-Post) shows a definite reduction in the amount of time spent monitoring and recording traces, as well as fewer number of traces, indicating the application of JSS.

The results of applying JSS to the first benchmark is clearly visible, with a greatly reduced amount of time spent in the overhead. The results from table 1 help to more clearly quantify the change, and show that the time spent monitoring and recording traces was significantly lower after JSS than before. This reduction made up for the additional time spent in compilation because of the larger amount of code being compiled. The estimated speed up increased by 0.01, indicating improved performance. However, the additional guard added to prevent the loop with the smaller trip count from executing too often caused more time to be spent in native execution and caused the benchmark to execute more slowly than pre-JSS. Future optimizations, such as loop-peeling, could remove the need for a guard and increase the performance of JSS.

The second benchmark ran over a larger period of time, increasing the benefit of JSS. The time spent monitoring traces was reduced, and the overall native execution time was much lower. Discrepancies between this benchmark and the previous could be due to better loop-bounds alignment that removed the need for a guard to prevent the fused loop from executing too often.

The third benchmark shows the greatest improvement due to JSS. The time spent in the interpretation, monitoring, recording and compilation phases was greatly reduced and the code spent more time running natively. This third benchmark highlights the benefits of JSS, benefits that could be realized in other less-optimal situations with additional optimizations.

5. Problems

The goal of JSS is to show that increasing the scope of code the tracer could use for optimization would lead to performance benefits required ignoring many erroneous situations, the most important of which is side exits occurring in loops. There are also many difficulties intrinsic to JavaScript which are out of the scope of a static analyzer. In particular global variables, with statements and eval expressions are dynamic language features that are out of.

Side exits were ignored in JSS to show that loop-fusion was a viable optimization for increasing performance. Detecting side exits would not be difficult, but would require traversing an AST after parsing completion or carrying a flag up the tree to show that a break or continue statement occurs. Traversing the AST after creation is undesirable because it would add extra processing time by adding another analysis pass. Adding flags to the tree would require augmenting the parser in many locations and adding additional information to the AST node objects.

Global variable dependencies are not particular to Java-Script, but their difficulties are certainly more pronounced than other languages. Because variables default to global, they abound in real-world scripts. The global variables then cause unnecessary dependencies because the dependencies cannot be easily determined statically. They also cause problems because any loop calling a function causes a dependency from that loop to every global variable. Methods for removing the pressure from global variables could include copying the global variable in the pre-header and using the copied name inside the loop. Analyzing called functions would reduce the number of dependencies, but would add analysis time. As shown in the example in Figure 3, both variables a and b are globals as their declaration is not preceded by the keyword var. A naive analysis would wrongly try to fuse the two for loops that are declared on lines 3 and 7. However, function bar which is called at line 4 of the first loop, changes the global variable b making the loops infusible. Hence, a more thorough inter-procedural analysis is required to handle global variables.

Figure 3: Global variables in JavaScript

With statements cause difficulties in dependency analysis because they change the namespace of code. Variables used within a with block could be referencing properties of the parameter passed to the with statement or variables defined in the local or global scopes. The properties of a JavaScript object are determined at runtime, which means that the dataflow analysis techniques used for JSS fail to determine the correct definition of variables used within the with block. As shown in Figure 4, function myObj is called to create an object with two properties (i and j) which is assigned to variable obj. The with construct on line 4 makes the properties of the object obj directly accessible to the code within the block it defines. The two while loops on line 5 and 9 use properties i and j from this object as their induction variables (instead of obj.i and obj.j). Even though these loops can be fused, a simple def-use analysis technique cannot find the definitions of these variables.

Figure 4: Use of with construct in JavaScript.

Another major difficulty occurs with the use of eval expressions. JSS performs optimistically with regard to these expressions in order to show the viability of the optimization, however, in reality eval kills the opportunity for loop-fusion. The major problem with this is that variables can be defined (or deleted!) within eval statements, and the side-effects of eval cannot be determined until runtime. If an eval statement occurs between the definition of an induction variable and its uses, then the bounds of the induction variable cannot be correctly identified. As shown in Figure 5, the eval statement on line 4 introduces a dependency between the loops and they cannot be fused. Analyzing the effects of an eval statement requires precise string analysis and is out of the scope of this paper. A reader can refer to [6] for a static string analysis algorithm for JavaScript.

Figure 5: Use of eval function in JavaScript.

A final problem we encountered was with finding appropriate benchmarks that would benefit from loop-fusion. Only the access-nbody test from the SunSpider benchmark suite had loops that qualified for fusion under our criteria. The lack of applicability of loop-fusion could very well be indicative of the programs developed or generated in JavaScript, however we have not yet done or seen research to back such an assertion.

6. Future Work

Future work would improve the safety of JSS and add other loop-related optimizations that could increase the performance of the resulting code. Currently the induction variable is identified based on the variable used in the loop condition, but true induction variable analysis would lead to a more accurate optimization. Dependency analysis, loop-peeling, and induction variable normalization would all aid in improving the performance and number of opportunities available to JSS.

Current dependency analysis marks any use or definition of a used variable as a dependency. The number of dependencies could be reduced by analyzing variable uses and reordering fused code to remove unnecessary dependencies. For example, if the same array, A, is used in both loops, the access variables can be analyzed to determine if it is a true dependence. This analysis could be performed using a standard combination of tests, such as GCD, Banerjee Inequalities, or I-Test. These tests have the undesirable quality of slowness, but a balance between analysis time and conservatism could possibly be found.

Loop-peeling could also increase the performance of JSS by removing the need for guard checks when the trip counts of the two loops are different. The guard checks add several unnecessary branch instructions to ensure the side effects after fusion remain the same as before. Although loop-peeling would reduce the amount of native code, because it would not occur within the trace, it would reduce the number of checks and branches occurring within the native code.

Finally, induction variable normalization would greatly improve the number of loops that could be fused and increase opportunity for additional optimizations. Normalization would allow loops that run in opposite directions to be fused, increasing the number of loops that could be fused by JSS. It would also increase opportunities for additional optimizations, such as induction variable elimination and strength reduction.

7. Conclusion

The JavaScript language proves difficult to optimize because of its dynamic nature and wide variety of constructs. TraceMonkey[4, 10] and V8 showed JIT to be an effective method of increasing performance, but further importance placed on this browser-focused language of choice demand further performances increases. JSS attempted to increase the performance by combining the larger granularity of JITed code associated with V8 and the tracing used by TraceMonkey. JSS increased the amount of code available for optimization by performing parse-time loop-fusion. Increasing loop size and decreasing the number of loops means there are fewer calls to the JIT compiler and thus less overhead. JSS showed the viability of this optimization but ignored some side issues due to outside constraints that will be remedied with future work.

8. References

[1] Webkit. SunSpider JavaScript benchmarks. URL http://www2.webkit.org/perf/sunspider-0.9/sunspider.html.

[2] D. Anderson. Tracemonkey and type instability in javascript. URL http://www.bailopan.net/blog/?p=112. Accessed on 06 May 2010.

[3] A. Gal. Tracing the web. URL http://andreasgal.wordpress.com/2008/08/22/tracing-the-web/. Accessed on 06 May 2010.

[4] A. Gal, B. Eich, M. Shaver, D. Anderson, D. Mandelin, M. R. Haghighat, B. Kaplan, G. Hoare, B. Zbarsky, J. Orendorff, J. Ruderman, E. W. Smith, R. Reitmaier, M. Bebenita, M. Chang, and M. Franz. Trace-based just-in-time type specialization for dynamic languages. In PLDI ’09: Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation, pages 465–478, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-392-1. doi:http://doi.acm.org/10.1145/1542476.1542528.

[5] Google. v8 javascript engine. URL http://code.google.com/p/v8/. Accessed on 06 May 2010.

[6] S. H. Jensen, A. Møller, and P. Thiemann. Type analysis for JavaScript. In Proc. 16th International Static Analysis Symposium, SAS ’09, volume 5673 of LNCS. Springer-Verlag, August 2009.

[7] K. Kennedy and K. S. McKinley. Maximizing loop parallelism and improving data locality via loop fusion and distribution. In Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing, pages 301–320, London, UK, 1994. Springer-Verlag. ISBN 3-540-57659-2.

[8] F. Logozzo and H. Venter. Rapid atomic type analysis by abstract interpretation. application to javascript optimization. In International Conference on Compiler Construction. Springer Verlag, March 2010.

[9] M. Wiki. J¨agermonkey – method based jit engine, . URL https://wiki.mozilla.org/JaegerMonkey. Accessed on 06 May 2010.

[10] M. Wiki. Tracemonkey – javascript tracing engine, . URL https://wiki.mozilla.org/TraceMonkey. Accessed on 06 May 2010.

A Look Into Domain Specific Languages

Not to long ago I had the opportunity to hear Charles Simonyi present on the topic of Intentional Programming (IP). In his presentation he discussed many of the underlying objectives and concepts that I found similar to those that we have in Business Process Management (BPM), the area where most of my experience lies. Reflecting the intentions of the developer (i.e. the business process) is the holly grail in terms of what we want to achieve in BPM, but where the “developer” is a domain expert rather than your traditional programmer. Unfortunately, the IP platform from Intentional Software, the company that Charles is a founder of, was not yet publicly available. As a result, I recently took a look at several other projects out their with similar objectives and concepts including Meta Programming System (MPS), Whole Platform, and XMF. Looking at these products naturally lead me to think much more about DSL(s), their architectures, and role in enterprise application development. At this point I’ll describe some of my high level findings to set the stage for some later postings and list some references for those interested in the topic.

1. Domain Specific Languages

A Domain Specific Language (DSL) is a specialized language engineered with the goal of implementing solutions for a particular problem domain. This is in contrast to General Purpose Languages (GPL) that are aimed at generally solving any type of problem. The key value proposition for a DSL is that is abstracts away the underlying complexities that are unnecessary for the targeted developer(s) in implementing a solution.

2. Horizontal vs. Vertical Domains

A domain is a problem space with a set of interrelated concepts. Domains can be further classified as being horizontal or vertical. In the context of DSL(s) Horizontal domains are those domains that are both technical and broad in nature with concepts that apply across a large group of applications. Examples of horizontal DSL(s) include the following (Note: I am biased towards Adobe technologies ;-).

  • Flex: An open sourced framework and DSL that targets the user interface domain by enabling Rich Internet Applications (RIA) that are portable across multiple browsers.
  • Cold Fusion: A language for creating data-driven web sites.
  • Document Description XML(DDX): A declarative language used to describe the intended structure of a one or more PDF(s) based on the assembling of one or more input documents.
  • LiveCycle Workflow: Similar to workflow languages such as BPEL, LC Workflow is An imperative language used for specifying the interaction of a process across one or more services.
  • SQL & DDL: Structured Query Language and Data Definition Language(s) are standard languages used for querying and defining data structures respectively with respect to relational databases.
  • Pixel Bender: An toolkit made up of a kernel language, a high-performance graphics programming language intended for image processing, and a graph language, an XML-based language for combining individual pixel-processing operations (kernels) into more complex filters.

Vertical domains on the other hand are more narrow by nature, pertain to an industry, and contain concepts that can only be re-used across a relatively small number of similarly typed applications. Good examples of vertical DSL(s) can be found in [3] and are listed below:

  • IP Telephony & Call Processing: A modeling language created to enable the easy specification of call processing services using telephony service concepts.
  • Insurance Products: A modeling language that enables insurance experts to specify the static structure of various insurance products from which a a J2EE web portal can be be generated.
  • Home Automation: A modeling language for controlling low-level sensors and actuators within a home environment.
  • Digital Watches: A wristwatch language for a fictitious digital watch manufacturer that enable the building of a product family of watches with varying software requirements.

Two important points with respect to horizontal vs. vertical DSL(s) are:

  • Its much easier to find examples of horizontal DSL(s) rather than vertical. This is in part due to the fact that developers are the primary contributors to new languages and creating DSL(s), which is a technical activity of itself. Its often a natural next step to look towards DSLs after establishing a successful API and/or framework that itself provides a level of abstraction
  • There is overlap between horizontal and vertical DSL(s) that must be managed. In his thesis Anders Hessellund describes the general problem of managing overlapping DSL(s) as the coordination problem [2].
Figure 1: Horizontal vs. Vertical DSLs

Figure 1: Horizontal vs. Vertical DSLs

3. Architectures

In this section we classify DSL(s) into three broad classifications based on architecture. The three classifications; Internal, External, and Language Workbench were first termed by Martin Fowler [4][8].

Internal DSL(s)

In our spoken languages experts typically do not invent entirely new languages to support their domain. Similarly, many argue that DSLs should be built on-top of an existing language such that its usage can be embedded in the host environment. Rather than remain true to the original syntax of the host programming language, DSL designers attempt to formalize a syntax or API style within the host language that best expresses the domain of the language.

Figure 2: Internal DSL Architecture

Internal DSL

Figure 2: Internal DSLs

The syntax support of a host language constrains how expressive an internal DSL can be to a great extent. Languages with a rigid syntax structure such as Java or C# have difficulty in supporting expressive DSLs, whereas flexible (& dynamic) languages such Ruby, Python, Groovy, and Boo are designed for such support. Projects such as JMock have shown that even syntactically rigid languages such as Java can still provide fairly expressive DSL(s) by implementing method chaining and builder patterns to give the illusion of a more natural language [5]. Fowler refers to DSLs that employ such patterns to work within the confines of a rigid syntax as fluent interfaces [6].

External DSL(s)

Unlike Internal DSL(s), external DSL(s) exist outside the confines of an existing language. Examples of such languages are SQL, XPath, BPEL, and Regular Expressions. Building an external DSL can be a complex and time consuming endeavour. An external DSL developer is essentially starting from a blank slate and while that may empower them to an extent, it also means that they must handle everything themselves.

So what exactly does “everything” entail? Similar to implementing a GPL, an external DSL developer must implement a multistage pipeline that analyzes or manipulates a textual input stream, i.e. the language’s concrete syntax. The pipeline gradually converts the concrete syntax as a set of one or more input tokens into an internal data structure, the Intermediate Representation (IR).

Figure 3: The multi-stage pipeline of an external DSL

Figure 3: The multi-stage pipeline of an external DSL <7>

The architecture of a external DSL can still vary significantly based on its runtime requirements.
The four broad subsets of the multi-stage pipeline above as described in [7] are:

  • Reader: readers build internal data structure from one or more input streams. Examples include configuration file readers, program analysis tools and class file loaders.
  • Generator: generators walk an internal data structure (e.g. Abstract Syntax Tree) and emit output. Examples include object-to-relational database mapping tools, object serializers, source code generators, and web page generators.
  • Translator: A translator reads text or binary input and emits output conforming to the same or a different language. It is essentially a combined reader and generator. Examples include translators from extinct programming languages to modern languages, wiki to HTML translators, refactorers, pretty printers, and macro preprocessors. Some translators, such as assemblers and compilers, are so common they warrant their own sub-categories.
  • Interpreter: An interpreter reads, decodes, and executes instructions. They range from simple calculators up to full blown programming language implementations such as for Ruby, JavaScript, and Python.

Language Workbench & Platform

In [8] Fowler describes the benefits of DSL(s) versus the cost of building the necessary tools to support them effectively. While recognizing that implementing internal, rather than external DSL(s) can reduce the overall tool cost, the constraints on the resulting DSL can greatly reduce the benefits, particularly if you are limited to static typed languages that traditionally have a more rigid syntax. An external DSL gives you the most potential to realize benefits, but comes at a greater cost of designing and implementing the language and its respective tooling. Language Workbench’s are a natural evolution that provide both the power and flexibility of external DSL(s) as well as the infrastructure (IDE, frameworks, languages, etc…) to greatly facilitate the implementation of the necessary tooling for the DSL. In the picture below we expand upon the concept of a Workbench to a Language Platform, which includes not only the Workbench but also the Runtime for the implemented languages as well.

Figure 4: Language Workbench & Platform

Figure 4: Language Workbench & Platform

The concept of a Language Workbench (or Platform) is relatively new. Much of initial theoretical work in this area was started by Charles Simonyi with the concept of Intentional Programming, which he started at Microsoft [10] and the work being done by his later company, Intentional Software. For purposes of this posting we will discuss two important aspects of Language Workbenches shown in the illustration above.

Multi-level Projection Editors

A projection editor is an editor for the concrete syntax of a language, whether that syntax is textual or graphical. These editors are tightly bound to their respective languages and offer the assistance capabilities which we now expect from modern editors such as code completion. Unlike, traditional free format text editors, projection editors can:

  • work directly with an underlying abstract representation (i.e. an Abstract Syntax Tree)
  • can bypass the scanning and parsing stages of the compiler pipeline
  • direct users to enter only valid grammatical structures
  • can be textual or graphical
  • can offer multi-level representations that abstract out various views of the program for various users and enabling multi-level customizations of the end solution [9]

Active Libraries

Its long been realized that compilers & interpreters are typically unable to take advantage of domain specific context encoded within the source code of a GPL. Moreover, as discussed previously DSL(s), both internal and external, often lack tooling support or acquire it through significant costs. Active Libraries enable DSL developers to overcome the overall lack of domain-specific tooling by taking an active role throughout the language tool chain, from the IDE, through the compiler, to the runtime. The are three types/levels of Active Libraries [11]:

  • Compiler Extensions: libraries that extend a compiler by providing domain-specific abstractions with automatic means of producing optimized target code. They may compose and specialize algorithms, automatically tune code for a target machine and instrument code. Czarnecki et al provided Blitz++ and the Generative Matrix Computation Library as two examples of libraries that extended a compiler with domain specific optimizations.
  • Domain-Specific Tool Support: libraries that extend the programming environment providing domain-specific debugging support, domain-specific profiling, and code analysis capabilities, etc… The interaction between Tuning and Analysis Utilities (TAU), a set of tools for analyzing the performance of C, C++, Fortran and Java programs and Blitz++ are good examples.
  • Extended Meta-programming: libraries that contain meta-code which can be executed to compile, optimize, adapt, debug, analyze, visualize, and edit the domain-specific abstractions. Czarnecki et al [11] discuss active libraries generating different code depending on the deployment context providing the example that they may query the hardware and operating system about their architecture.

For my purposes I was primarily interested in the domain-specific tool support afforded by active libraries (i.e. #2 above). In the case of Language Workbench’s we are looking for domain specific support for rendering, editing, re-factoring, reduction(i.e. compiling), debugging, and versioning capabilities.

4. Role of the Language Engineer

While the concept of DSL(s) and Unix “small languages” have been around for some time, the focus has been on slowly evolving General Purpose Languages (GPL). As a result, the industry currently lacks developers, patterns and best practices, as well as tooling that facilitate the design, implementation, and evolution of languages on a larger scale. Language Engineering is a more complex activity than typical software and system engineering. In-order for the benefits of DSL(s) to be fully realized there needs to be a new discipline evolved that focuses on engineering of domain specific languages. Similar to others [18] this paper asserts that there will be a paradigm shift that demands the creation of a new role, the Language Engineer, in the software development process. Furthermore, the creation of this role will significantly reduce the workload of application developers in building a given solution in a similar fashion to the introduction of 3rd generation languages and subsequently platforms (e.g. Java/J2EE and .Net).

Figure 5: Developer Roles

Figure 5: Developer Roles

While we would not expect to see the dramatic increase in productivity that we saw with the introduction of 3rd generation languages, we would still expect the productivity increases to be substantial once effective tooling is in place for both the Language Engineer and the Application Developer as the consumer of any given DSL.

5. Benefits & Costs

In this section we review & summarize many of the costs and benefits associated with creating and using domain specific languages.

Benefits of DSL(s):

  • Domain Knowledge: Domain specific functionality is captured in a concrete form that is more easily and readily available to both application developers, who are direct language users, and domain experts. The benefit of capturing domain knowledge can be realized throughout an applications life-cycle.
  • Domain Expert Involvement: DSLs should be designed to focus on the relevant detail of the domain while abstracting away the irrelevant details making them much more accessible to domain level experts with varying levels of programming expertise. In cases where there is already an existing notation understood by the domain expert the DSL should be designed with a concrete form to match that existing notation. For example, for users familiar with the Business Process Modeling Notation (BPMN), a related Workflow DSL should enable users to implement workflow solutions using familiar BPMN constructs.
  • Expressiveness: DSL’s that are tailored to a specific domain can more concisely & precisely represent the formalisms for the specified domain. Furthermore, DSLs tend to be declarative in nature, enabling the developer to focus on the “what” while eliminating the need for them to understand or over specify their program with the “how.”
  • Compilation & Runtime Optimizations: As discussed with Active Libraries the additional context afforded DSL(s) can provide for domain-specific optimizations at either compile and/or runtime. Adobe’s Pixel Bender language is an example of a language designed to provide optimizations by leveraging parallelization on multi core machines. Similarly, there are many use cases in the scientific community where DSL’s have been leveraged for abstraction and optimization to handle sparse arrays, automatic differentiation, interval arithmetic, adaptive mesh refinement, etc…[1] – The expressiveness of DSLs not only eliminate the need for developers to over specify their code, but consequently provides more opportunity for the infrastructure to intelligently optimize where and when possible.
  • Re-Use: Due to the additional costs associated with designing and supporting DSLs mentioned previously DSLs should not be created for isolated solutions. However, in cases where a problem in a particular domain (horizontal or vertical) is repeating there is ample benefit to be gained by re-using the expressiveness of DSLs and the other benefits that follow.
  • Reliability: Like libraries once tested DSL(s) provide higher level of reliability when re-used across projects.
  • Toolability: By associating a concrete and abstract syntax to a language, such as is the case with External DSL(s) or DSL(s) defined in a Language Workshop, tools are better enabled to analyze and provide guidance through assistance or verification tooling.

Costs of DSLs:

  • Tool Support: One of the primary benefits of creating DSL(s) vs. application libraries is the ability to add tooling based on the concrete syntax of the DSL, however today such tooling does not come for free, it must be built as part of the cost of creating the DSL. Tooling features include features such as code guidance/assistance, re-factoring, and debugging.
  • Integration: The successful development of most applications today requires the convergence of multiple views. Business analysts, domain experts, interaction designers, database experts, and developers with different types of expertise all take part in the process of building such applications. Their respective work products must be managed, aligned and integrated to produce a running system [2].
  • Training Cost: As opposed to mainstream languages, DSL users will likely not have an established specification to look to for guidance. However, this cost can be offset by the degree to which the DSL narrowly reflects and abstracts the domain concepts.
  • Design Experience: DSL platforms are not yet widely adopted in the software industry. As a result, there is an evident lack of experience in the field around language engineering, language design patterns, prescriptive guidelines, mentors, and/or research.

Conclusion

Anyway, hope to have some more postings related to DSLs in the not to distant future. At this point I am attempting to pull together a lot of good ideas from various areas. Which reminds me here is the list of docs referenced throughout this posting.

References

[1] T. Veldhuizen and D. Gannon, “Active Libraries: Rethinking the role of compilers and libraries,” page 2

[2] A. Hessellund. “Domain-Specific Multimodeling,” Thesis. IT University of Copenhagen, Denmark. page 15.

[3] S. Kelly & J. Tolvanen, “Domain-Specific Modeling,” John Wiley & Sons, Inc. 2008, Hoboken, New Jersey.

[4] M. Fowler. “Domain Specific Language,” http://www.martinfowler.com/bliki/DomainSpecificLanguage.html

[5] S. Freeman, N. Pryce. “Evolving an Embedded Domain-Specific Language in Java,” OOPSLA, Oct 22-26, 2006, Portland, Oregon, USA.

[6] M. Fowler. “Fluent Interface,” http://www.martinfowler.com/bliki/FluentInterface.html

[7] T. Parr, “Language Design Patterns, Techniques for implementing Domain Specific Languages, ” Pragmatic Bookshelf, 2009. pages 14-16.

[8] M. Fowler. “Language Workbench,” http://www.martinfowler.com/articles/languageWorkbench.html

[9] K. Czarnecki, M. Antokiewicz, and C. Kim. “Multi-Level Customization in Application Enginneering, ” Communications of the ACM, Vol I, December 2006. pages 61-65

[10] C. Simonyi. “Intentional Programming – Innovation in the Legacy Age, ” Presented at IFIP WG 2.1 meeting. Jun 4, 1996

[11] K. Czarnecki, U. Eisenecker, R. Gluck, D. Vandevoorde, and T. Veldhuizen. “Generative programming and active libraries (extended abstract). ” In M. Jazayeri, D. Musser, and R. Loos, editors, Generic Programming ’98. Proceedings, volume 1766 of Lecture Notes in Computer Science, pages 25–39. Springer-Verlag, 2000.

[12] Business Process Modeling Notation, http://www.bpmn.org/

[13] Meta Programming System, http://www.jetbrains.com/mps/index.html

[14] T. Clark, P. Sammut, J. Willans. “Applied Metamodeling A Foundation For Language Development, ” Ceteva 2008.

[15] R. Solmi. “Whole Platform, ” Thesis for Department of Computer Science at University of Bologna. Italy. March 2005

[16] B. Langlois, C. Jitia, E. Jouenne. “DSL Classification,” In 7th OOPSLA Workshop on Domain-Specific Modeling, 2007.

[17] OMG, Meta Object Facility (MOF) 2.0 Core Proposal, ad/2003-04-07, 7 Apr 2003.

[18] A. Kleppe. “Software Language Engineering.” Addison-Wesley Professional. 12/09/2008. Page 19.

Visual Programming Environments For Kids

Well, I happily finished my first semester at the Georgia Institute of Technology (GIT) at the beginning of May. It was a great semester, I coded moreso than I have in ages, had to relearn C/C++, and added LISP to my repertoire. The courses at GIT are project intensive so I was able to do some fun stuff such as writing a multi-threaded Web-Server that communicated with a Proxy Server via shared memory, implementing an inference engine in LISP, LLVM passes for detecting infeasible branches due to correlated predicates and parallizable loops, and testing out some cool robots for a HCI project that I worked on with a group of four other Georgia Tech Grad students called Bots-For-Tots. Its the last one that I’ll focus on for purposes of this entry. The project had us go through an user focused process from analysis, to design and selection, to prototyping. The end result was a programming environment we called Bot-Commander, which leveraged open source technologies MERAPI & LEJOS and of course AIR to enable children (ages 3-8) and myself 😉 to easily program a Mindstorm robot. Considering that I have 3 children, ages 1, 4, and 6 and like most geeks am immediately drawn to words containing “bot” this project was close to heart. For those of you interested in bots and/or child education below is selected content from the project.

My co-contributers included: Albert Brzeczko, Basil Udoudoh,
Dimuthu, & Bryan Hickerson

The Premise

How do we teach children technology? is a basic question that, as the ubiquity of computing in 21st century progresses, more and more parents and educators are grappling with at earlier stages of a child’s development. On the one hand the question is very important assuming that a key measure of a communities success is the number of technologists (e.g. engineering, computer science, etc…) that it outputs. On the other hand the question can be considered irrelevant since children are bound to learn “enough” about technology embedded into their communities society and culture through its application. However, another consideration is whether the question “How do we teach children technology?” is the right question to be asking ourselves. The concern is that we risk teaching technology as a set of abstract concepts that are difficult for children to learn, internalize, and apply. What is lost is that technology has the ability to serve as a platform for children at all ages to apply creative thinking across multiple disciplines and interests. That ability is largely untapped today. While there has been substantial work in leveraging technology as a learning platform for older children in certain areas, the solutions have fallen short in terms of enabling younger children (ages 3 and up) and being adopted in the mainstream throughout schools and in the home.

It was in the early 1980’s when Seymour Papert published his influential book, “Mindstorms: Children, Computers, and Powerful Ideas.” In the book Papert gave rise to a new mode of thinking called constructionism from which he argues that through technology children can now integrate the mechanical with the digital to create personally meaningful projects from which they can problem solve, test, and create new ideas and conceptual models of the world. Personally meaningful projects are those that children are driven to work on out of personal interest. Paperts research led to the creation of the Logo programming language, which was designed to be powerful enough for research yet simple enough that it could be used by children. The language was popularized with the introduction of a virtual turtle that children would teach to draw via programming.

From programmable turtles to bricks, crickets, and cats the concepts introduced by Papert have led to a host of constructionist environments that help children to learn about learning by teaching robots how to interact with the world in which they live. Through working with Papert, Lego Inc. introduced a commercial robot construction kit called the RCX Programmable Brick. Other lesser-known robot kits have become available as well and virtual robots, similar to the Logo turtle, are now free for downloading. Common constraints in all of these environments are that they typically target an age group of 8 and above and require a high degree of investment by not only the child, but also the educator (parent or teacher) in terms of training and time. In this project we will be investigating these existing tools with the goal of designing a constructionist environment that that not only targets younger children, but also reduces the cost to both the children and educators in terms of training and time, resulting in a product that is less prohibitive to mainstream usage.

The Focus Group

This section discusses the qualitative methods used for exploring the problem. We needed a focus group and we needed one fast.
The solution was a Bots-For-Tots pizza & ice cream party at my house, where I invited a bunch of my son’s classmates over (informal but it worked).

Finding Bots

Our impressive inventory of bots consisted of Lego Mindstorm and MIT’s Scratch program, which gave us a physical and virtual robot respectively. However, Lego Mindstorm targets children at the ages 10+ which we knew would likely be well above the children’s capabilities. We needed additional bots that targeted a younger age group to give an more accurate account of the current state of constructionist toys. The answer(s) lay in the acquisition of two additional robots; Pico Crickets from the Playful Invention Company and the Roamer from Valient Inc. both of which prescribed to constructionist ideas and concepts.

Composition of the Group:

  • Ages ranged from 3 to 6.
  • All the children were boys
  • All of the parents involved professed a strong interest in their kids learning about technology.
  • Two of the parents had a job directly involved with technology, while others worked in the fields of medicine, psychology, and public relations.

Execution

The plan was simple.

Step I – Grease & Sugar
First, we served all of the parents and their children Pizza and Ice Cream providing us the opportunity to talk to the parents and children about their respective background while waiting for everyone to show up.

Step II – Constructionism 101
Next, we did a learning activity that did not involve computers or actual robots. The purpose of the activity was:

  1. Learn about the children’s ability to comprehend technical concepts
  2. Provide an overview for kids and their parents of how each of the toys worked
  3. Have fun!

We asked for volunteers from the children for both a robot and a set of computer programmers. One child volunteered as a robot, while the rest volunteered as computer programmers. We then brought out two poster boards, one empty one titled “Program” and a second one containing a set of square cut outs velcro-ed to the back labeled “Blocks.” We asked the children whether robots could “think” like we do. The children’s answers were mixed, but we explained that robots cannot think by themselves and that they need computer programmers to help them think by telling them what actions to take (i.e. creating a program). The children then participated in building a program by choosing action blocks from the “Blocks” poster-board and moving them to the “Program” poster-board. After the “computer programmers” were done creating their program we had our acting robot execute it through by stepping, jumping, growling, and barking as instructed by the program.

Programming 101

Programming 101

Step III – Breakout Time
After our brief course in robotics and programming we gave a very brief introduction to the different robots we had around us and then broke up, letting the children gravitate to the robots that interested them the most. We had the five following robot stations setup.

Station I – Lego Mindstorm

At the core of Lego Mindstorm is a programmable brick that has the capability of accepting input from 3 sensory devices and controlling up to three output devices (i.e. motors). While the brick has an interface for building programs directly on it, more typically users will use the Mindstorm programming environment to build a program and deploy it to the brick using either a USB cable or bluetooth connection.

Mindstorm

Target Age Group: 10+
Observations:

  • Providing a quick look at Mindstorm proved to be the most problematic for two reasons. First, if the robot was left idle it would shutdown at which point you have to re-establish the bluetooth connection to demo it. This turned out to be an inconvenient interrupt which required that we ask everyone to please wait while we re-established the bluetooth connection. Second, when using the visual programming environment for Mindstorms each action/block has a great deal of configurations, which were often difficult to see on a large screen and impossible to walk through with young children.
  • Parents and children were intrigued with the possibility of creating a humanoid robot as shown on the Mindstorm Box.
  • Parents found the lack of organization and hundreds of small pieces for Mindstorm to be daunting.
  • Surprisingly the children showed no interest in Lego Mindstorm once we broke up across the different stations.

Station II – Pico Crickets

Similar to Mindstorm, Pico Crickets leverages a Visual Programming interface, motors, input sensors, and legos. However, Pico Crickets has two distinct differences. First, it is target towards a younger age group of 8+. Secondly, Pico Crickets strives to work with the artistic capabilities and intuition of children rather than purely mechanics (i.e. gears and motors).

Pico Crickets

Target Age Group: 8+

Observations:

  • Out of all the robots, Pico Crickets held the most attention not only from children, but also parents (one parent actually built a 7 step program). Three children spent a substantive amount of time on Pico Crickets.
  • The organization & number pieces in the Pico construction kit was much less daunting than that of Mindstorm.
  • Children seemed to either want to play with the lego pieces or the programming environment (to create pretty programs) but did not seem to correlate the relationship between the two.
  • The children using the programming environment did so to in the same way that they use legos, they were snapping together virtual blocks to create diagrams not to execute them.
  • One parent remarked that they liked the toy but felt that it required to much hand holding for the children.

Station III – Scratch IDE

Scratch is a programming language developed by the Lifelong Kindergarten Group at the MIT Media Lab to help young people learn how to develop computer programs. The development of Scratch (and its name) was inspired by the scratching process that DJs use to create new sounds and music by rubbing old-style vinyl records back and forth on record turntables, creating new and distinctively different sound out of something that exists. Similarly scratch projects allow young developers to mix together graphics and sounds, using them in new and creative ways.

scratch2

Targeted Age Group: 8+

Observations:

  • There was one child that played with Scratch, but that was the only robot that he played with the entire time. While he seemed to enjoy making Scratch (the out-of-box virtual cat robot) do stuff, he particularly enjoyed the more personal aspects of scratch that enabled him to upload his own picture and record his own voice to use in a program. Note: however that uploading his own picture is still a complex process for which he needed help.
  • Parents did show interest in the fact that scratch was free.

Station IV – Valiant Roamer

The Roamer is a commercialized version of the physical dome shaped robot that Papert initially worked on at MIT while designing the logo language. While there is a visual programming environment for Roamer, similar to the other robots, there is no link between the visual programming environment and the physical robot. The programming environment, called Roamer World, is simply a simulation of the physical robot in a virtual world. The programming interface for the physical roamer is are the keys located on top of the robot.

Valient2

Valient

Targeted Age Group: 4+

Observations:

  • Both parents and children attempted to use the Roamer once but then quickly left for another toy once it did not do as they intended (which happened to always be on the first attempt).

Station V – Wacky Wigglers Building Set

Now here is a robot that you can find in your typical toy store. While the Wacky Wigglers Gears set would not be considered a constructionist kit due to its lack of an actual programming interface we still wanted to place it out there to see how children would respond to mechanical aspect to it. The objective of the Wacky Wigglers Building Set is to piece together a robot with a whole lot of gears which you can then control using a basic forward, backward, and turn motion remote control.

Gears

Target Age Group: 5+

Observations:

  • There was substantive interest in the Wacky Wigglers Building set. At least three children spent time successfully putting together parts of a robot. One child in particular committed himself all the way through until the robot was complete. Note: there was no adult involvement in constructing this robot.

Dispelling Myths (another observation)

Another interesting observation that we all made at the focus group was that several children came to the party with a preconceived notion of what a robot was and it didn’t fit with the ones that we had prepared for the children. Instead 3 of the children assumed that robots were human looking and dangerous.
Terminator
It is our hope that the focus group has given them a different notion of what a robot can be.

The Design Stage

In this phase of the project we brainstormed up three different possible designs to tackle the problem domain we were addressing. I won’t spend to much time on each of these because we we only chose one of them in the end.

Design I – IntelliBlocks

In brief the concept here was to implement a completely hardware based solution to alleviate the disconnect that children faced when interfacing between the computer and a physical bot. Rather than programming a robot using a computer, the program would actually become part of the environment/stage that the bot was running in (i.e. the program itself was physical). Below are some pics of some of the illustrations we put together for this design.IntelliBlocks1
IntelliBlocks2The first picture illustrates a lego board that a) implies the existence of a robotic train above it and b) contains an empty sequence block that can be used to program the train. We assume there are several actions that a train can perform and that children ages 3-8 would be aware of such as (go forward, backward, whistle, etc…). The second picture illustrates the use of blocks representing those actions to build a program that commands the train to loop around the train track until it senses that it is near a station at which point it blows its whistle and completes.

Design II – SoftBots

In this design, similar to above, an objective was to alleviate the disconnect that children faced when interfacing between the computer and a physical bot. However, rather than implement a completely hardware based solution in this design we proposed implementing a completely software based solution.

Softbots1

The picture above shows a hacked up illustration that is somewhat similar to Scratch, however our goals were to 1) improve upon the personalization capabilities by reducing the steps needed for children to record audio and take snapshots of themselves and 2) provide higher level abstractions than scratch by not treating all objects generically as a sprite, but rather for the environment to be aware of the capabilities of any given object on the stage and to know what capabilities to make available based on the combination of objects on the stage. Consider in Scratch if you had a Martian sprite and a gun sprite, one solution program the Martian to pick up the gun would be to tell the Martian sprite to move in the direction of the Gun until a color was detected and then to switch the “costumes” of the sprite to show it holding the gun. Rather we would prefer the programming environment knowing that there was a Martian bot and a Gun Bot and accordingly enabling the capability for the Martian to pick up the gun by making a high-level action “Pickup Gun” available when the Martian is selected.

Design III – Bot Commander

See Prototype

Prototyping Time

The Bot-Commander was a software/hardware based design that ended up being what we believed to be the most effective and feasible design solution that we could prototype within the given time frame (<2weeks) and with the available resources. Moreover, thanks to a presentation given by Andrew Powell around MERAPI, Mindstorm, and AIR at a recent AFFUG meeting we had heightened confidence that our goals could be achieved.

Jumping right to it, the programming environment (as an alternative to the IDE provided out of the box with Mindstorm) is shown below.

Bot-Commander

Bot-Commander

Note that the user has a set of actions on the left hand side that he or she can drag onto a canvas. There are actions for movement, sound, an sensors. The program above will wait until a sound (such as a clap) occurs then move the robot forward, to the right, in a circle, laugh, cry, and finally, play a tune.

Architecture Talk

Before considering usability we will start off with a high level view of the architecture of the prototype, which is reflected in the diagram below.

botcommander-arch
Bot-Commander Architecture

Fortunately, from an architectural perspective there was a great deal of functionality already available in the community that we were able to leverage in order to prototype Bot-Commander. Here is a brief summary of the various components that made up the Bot-Commander architecture.

  • Bot-Commander – This is the UI that was implemented by the Bots-For-Tots team to effectively replace the Mindstorm visual programming environment with an alternative that is targeting younger children (ages 3-8). The UI was implemented using Adobe’s Flex/Actionscript technology and is hosted within Adobe’s Integrated Runtime (i.e. AIR), providing the best of two worlds; the web and power of desktop computing. By leveraging AIR Bot-Commander can tie in more closely to the users desktop to interact with Merapi and LeJOS.
  • Merapi – Not only is Merapi an actual volcano on the actual island of Java, but it is (more importantly this team would argue) a bridge between Adobe AIR applications and Java. Merapi has been designed to run on a user's machine, along with an Adobe AIR application and provide a direct bridge between the Adobe AIR framework and Java, exposing the power and overall capabilities of the user's operating system, including 3rd party hardware devices.
  • Bot-Command Generator – Implemented as a Merapi message handler the Bot-Command Generator is responsible for interpreting a sequence of actions deployed from the Bot-Command UI, generating a Java program from those actions, and then compiling, linking, and uploading the compiled binary up to Alpha Rex (via LeJOS).
  • LeJOS – As and open source Java programming environment for the Lego Mindstorm NXT LeJOS was critical for us to get up and running a prototype. LeJOS allows Java developers to program Lego robots. LeJOS consists of:
    • Replacement firmware for the NXT that includes a Java Virtual Machine.
    • A library of Java classes (classes.jar) that implement the leJOS NXJ Application Programming Interface (API).
    • linker for linking user Java classes with classes.jar to form a binary file that can be uploaded and run on the NXT.
    • PC tools for flashing the firmware, uploading programs, debugging, and many other functions.
    • A PC API for writing PC programs that communicate with leJOS NXJ programs using Java streams over Bluetooth or USB, or using the LEGO Communications Protocol.
  • Tiny VM – An open source, Java based replacement firmware for the Lego Mindstorms RCX & NXT microcontrollers. TinyVM's footprint is about 10 Kb. The project was forked into LeJOS back in 2000, where the Tiny VM is now a component of a larger architecture for programming Mindstorm robots.
    </li

  • Alpha Rex – Known as Roby by one of the team member’s kids, Alpha Rex is the robotic hardware that children can now program using Bot-Commander. Mindstorm robots can take on many forms, but Alpha Rex was chosen for this project due to a) his humanoid form, which often invokes curiosity in both adults and children and b) his maximization of the use of sensors and motors.

There are few other things to note with respect to the architecture diagram above. First, outside of Alpha Rex, everything else is running as one application on the user’s desktop. Second, Merapi, Bot-Command Generator, and LeJOS are running in co-operative process that is hosting an instance of the Java VM. Communication between the Bot-Commander UI (running in AIR) and the Java components happens through passing serialized objects in the Action Message Format (AMF, a format for object remoting) over sockets. Third, communication from the Desktop to Alpha Rex happens over either a USB cable or Bluetooth. In both cases, LeJOS leverages open source projects to implement the communication.

Usability Talk

It was interesting to see during our initial focus group that one of the robots that cultivated the least amount of interest from children 3-8 was the Mindstorm robot. This is interesting because Mindstorm is a) supported by a well-known toy manufacturer (i.e. Lego Inc.) and b) the most popular of any robotics kit among teenagers and/or adults. However, its not surprising from the point of view that Mindstorm does not target children as young as 3-8.

So why you might ask did we decide to use Mindstorm as a basis for the prototype? The answer is simple; Mindstorm provides an extensible environment from which to build effective prototypes, extensible enough to the point of replacing its visual programming tool.

While we identified a good number of issues at our focus group with children ages 3-8 using robots and they’re respective programming environments, for purposes of the prototype we attempted to address only few of the more critical issues that we saw. The goals were to provide enough to a) complete phase IV of the project and b) provide an environment that kids can begin enjoying now.

Issues the prototype was addressing:

  • Hardware Abstraction – While children in our target age group tend to understand and able to identify objects such as trains, cars, dolls, and yes even a robot, none of the children that we interacted with in our focus group were familiar with more primitive objects such as gears, motors, and sensors (not to mention that many of those primitive objects are choking hazards;-). Having to deal with primitive hardware objects posed a significant barrier to a child’s success in accomplishing their desired end goal of programming the robot.
  • Connectivity – Both of the robot kits we had that had both a hardware and software component had connectivity issues. With Mindstorm in particular children were confused when the robot had automatically turned off and we had to explain to the children that the Bluetooth connection needed to be re-paired. The children moved onto the next robot, while the connection was fixed, but never made it back.
  • Layout – Each of the programming environments that the children used in attempting to program the robots had varying levels of complexity, with Mindstorm being the most complex. Children did not understand how the placement of the actions in the program meant different things (e.g. connecting actions made a sequence, while disconnected actions implied parallelism).
  • Software Abstraction – Mindstorm in particular had very primitive programming constructs (i.e. actions). If a robot has a claw the child might expect to have a “close claw” and “open claw” actions, however, in Mindstorm for example almost everything is controlled as motor A, B, or C.
  • Keyboard Usage – We found that children could play effectively with a visual programming environment such as scratch when the majority of user interactions they performed were done through the mouse. More complex interactions that involved typing with the keyboard acted as a barrier to accomplishing the primary tasks of creating a program.

The following sections will briefly cover what we did with the prototype to overcome these issues:

Hardware Abstraction

To overcome the issue of dealing with primitive hardware objects we are assuming that children are starting off with a complete robot. In the context of Mindstorm this means that the robot has already been built (bypassing a major step). While it would seem quite feasible to imagine that children might still piece together robots using less primitive objects such as a claw piece (which is more of an accessory/attachment for an existing robot rather than a building block), nothing such as that is currently available for Mindstorm and for purposes of the prototype we assume a complete & already accessorized robot.

We started off with:

Mindstorm Out-Of-Box

Mindstorm Out-Of-Box

And have ended up with:

Alpha Rex Finished

Alpha Rex Finished

Connectivity

We have for the most part eliminated the complexities of connectivity at this point, by not requiring a connection to be configured. Instead, once the user has decided to run their program we dynamically look for the Robot over either USB and/or Bluetooth and upload the resulting program. Preferably we would also have a means of showing the user when one or more robots is detected in the area by polling the Bluetooth connection every few seconds assigning them default user friendly names, however we have not added this feature as of yet.

Below is the Bot-Commander UI which shows the Run button that will be used to find the Mindstorm robot, upload the program (that would be designed on the right hand canvas), and then run it.

Layout

Here we attempted to make the layout as easy as possible. Of course implementing any type of diagramming tool in the relatively short amount of time we had for this phase of the project was challenging. At this point, users can simply drag actions from the left hand pane of the Bot-Commander and drop them onto the design canvas on the right hand side. The difference between our implementation and the visual diagramming environments of the other tools we looked at in the focus group is that sequence is assumed as you drop actions onto the right hand side. Users do not have to visually attempt to snap pieces together or draw edges between actions separately. Instead, they drag and drop, sequence is assumed, and the edges are drawn automatically to reflect the assumed sequence. A major constraint at this point, however is that we do not allow re-ordering without starting from the beginning.

Software Abstraction

Mirroring the hardware primitives, all of the visual programming environments we looked at in the focus group also had software primitives to deal with. Even Scratch, which did not force the use of hardware, but primarily dealt with virtual bots did so at a fairly primitive level (e.g. all “bots” are actually sprite/2d image objects with a fairly limited set of capabilities that are generalized across all sprites). Consider in Scratch if you had a Martian sprite and a gun sprite, to program the Martian to pick up the gun would require telling the sprite to move in the direction of the Gun until a color was detected and then to switch the “costumes” of the sprite to show it holding the gun. Rather we would prefer the programming environment knowing that there was a Martian Robot and that he/she had the capability to pick up a gun and reflect that by making a high-level action called “Pickup Gun” available.

For the purposes of this prototype we are limited to abstracting away the configuration required to do move operations in Mindstorm. Rather than having a single Move action (as is the case in Mindstorm) that requires the user to know what motor they are dealing with, how many revolutions to perform, and in what direction we are summing up this behavior in two separate actions; forward and backward. Bot-Commander assumes the number of revolutions to make the robot step in either direction and even the direction itself is assumed based on the type of robot built (i.e. Alpha Rex).

Keyboard Usage

Currently, everything that can be done in the Bot-Commander can be done via a Mouse. It is our intent to maintain that constraint on the design as much as possible.

Limitations

While we have been successfully in getting the base architecture implemented and addressed some usability concerns, there are a good number of relevant features that we were not able to get to at this point. In particular, we were not able to get to the personalization features which we found to be quite effective for children with both Scratch and Pico Crickets. In addition, we did not have any graphic artists on our project team to create effective images to represent the actions within Bot-Commander. Packaging up the configuration and making an installer for this app would take some extra effort as well since there are platform specific dependencies with respect to USB and bluetooth drivers.

Kudos to MERAPI and LEJOS

At this point I felt obligated to throw some Kudos at the two open sourced projects we used, both greatly accelerated the the rate at which we could work and did as advertised. Its not to often that you use software and it just works. Both Merapi and LEJOS did just that.

Finally!!

Well I was to lazy to do much editing, instead copying and pasting over sections. Just to sum up our work above we ended the project by performing some user & heuristic evaluations, providing us with feedback on the prototype. Unfortunately that aspect was rushed and while useful was not unbiased (i.e. my kids were the only ones to do user evals) That is where the work ended! However, I am hoping to translate this work over to an example within an investigation I am doing on Domain Specific Languages.

References

Here are some references for those interested.

Scratching over the Holidays

For the past few weeks I have been working on helping my kids better understand what I do at work everyday, i.e. programming. Admittedly I do not not program everyday but they will have years to learn about other topics such as organizational politics and communication overload , so no reason to start on those topics yet and besides my eldest son has always shown a high degree of interest in computers and the workings behind them.

Keep in mind that my kids are fairly young to understand programming concepts. My two eldest are 4 and 6. You cannot have high expectations at that age for programming but just provide them the knowledge and the tools so that their creative minds can go to work & have fun and that is what I attempted to do.

Before diving into the details of this programming experiment below, for those just interested in the end product of our holiday weekend, the “Thor’s Christmas Feast” project, click on the following image.

Thor's Christmas Feast

Thor's Christmas Feast

Where to Start?
First question you are confronted with when attempting to show your kids how to program are “what tools to use?” I knew C++ would be out of the question and considered what I could do with Java or ActionScript. I then did the next logical thing, I Googled the topic (kids & programming) and eventually came across some very interesting tools, which I hadn’t heard about before; Scratch and Alice. After a brief look at both I decided to start with Scratch and later try out Alice.

Note: For parents with older kids I also came across another excellent resource, “Java Programming for Kids, Parents, and Grandparents,” written by Yakov Fain.

What is Scratch?
Scratch is a programming language developed by the Lifelong Kindergarten Group at the MIT Media Lab to help young people learn how to develop computer programs. The development of Scratch (and its name) was supposedly inspired by the scratching process that DJs use to create new sounds and music by rubbing old-style vinyl records back and forth on record turntables, creating new and distinctively different sound out of something that exists. Similarly scratch projects allow young developers to mix together graphics and sounds, using them in new and creative ways.

Technically Scratch is an interpreted dynamic visual programming language based on and implemented in a derivative of Smalltalk, called Squeak (hmmm.. need to play around with that as well). Being dynamic, Scratch allows code to be changed even as programs are running. As a visual language, Scratch does not involve writing code in the traditional sense; instead, you write “code” by dragging and dropping component blocks onto a graphical editor (Hey, Sounds like LiveCycle;-).

Developing a Lesson Plan
Admittedly, I am not a teacher and know little about the process of education, however like most parents I am driven to learn more about education both to help me communicate with my kid’s teachers at school and to help my kids learn at home as well. If anyone has any good recommendations on resources, please forward them to me!

My first step in teaching my kids programming concepts in Scratch was to write a list of goals to attain:
1) Learn the Scratch terminology and Environment
2) Co-develop a program to make our dogs bark
3) Co-develop a program to make our dogs spin
4) Have them write a program by themselves
5) Co-develop a game and publish it to the Scratch WebSite

Goal 1 – The Scratch Terminology & Environment
First we talked about this computer program called Scratch which allows you, a Computer Programmer, to write programs such as video games. Next, we discussed the primary concepts within Scratch of a Stage and Sprites. Technically the stage is the area in the Scratch IDE, located in the upper right hand side, where your Scratch applications execute. For my kids, however, the stage is the area where all the Sprite pictures play with one another. Then we opened up Scratch and identified the purpose of the various panels in it as shown below.
scratch-env

  1. The Stage: Place for Sprites (i.e. pictures of things) to play with one another
  2. Sprite List: Panel for selecting a Sprite to work with
  3. Code Blocks: The top panel is a set of Category buttons. You can see different types of blocks by clicking on different categories. Code Blocks are what computer programmers use to tell Sprites what to do. The color coding for the various categories seemed to help my kids immensely when referring back to categories of code blocks.
  4. Script Area: Technically the place where you compose programs by assembling code blocks. but for my kids its the place you move blocks to tell Sprites what to do.
  5. The Flag and Stop buttons: The two primary buttons for starting and stopping Scratch applications respectively.

This part went fairly smooth and after a few sessions of messing around my kids quickly picked up on the concepts of Stage, Sprite, Block, Script, etc…. By pulling up some of the sample apps such as Pong they learned how the Flag and Stop buttons worked while having a bit of fun as well.

Goals 2 & 3 – Co-Developing Applications
After learning the environment and trying some samples we set off to create our own very basic Scratch applications. I have to admit that I initially didn’t get it as to why a Kids programming language should be so media centric (yeah I am a little slow like that), but after seeing my kids interact with Scratch it became much more clearer to me. One of the nicest things I saw with the Scratch was that it personalized the development experience in new ways by making it easy for my kids to add personalized content and actively participate in the development process. Not only could they develop abstract programs to do mindless things with a Cat or a box, etc… but they could add THEIR own pictures and THEIR own voices to the Scratch environment which has given them hours of fun and driven them to learn. Awesome Job to the Scratch dev team!

Anyway, over a few sessions I sat in the drivers seat while the kids and I worked on making Thor, our English Bulldog, bark and spin.

Goal 4 – Have the Kids Write a Program Themselves

For this one both of the kids chose to make Thor Bark. Now the interesting thing for me here was the differences in my two kids. For my older son it probably took about half an hour for me and him to walk through creating a scratch application with Thor barking. He clearly knew what he was doing and needed to be done (We had done it several times with me in the drivers seat), but he was more interested in the process of creating the app than the actual end result, so he ended up asking me TONS of questions along the way (e.g. Why do Sprites need commands? What if I use the move Block instead?, etc….). My younger son on the other hand cared little about the process, but definitely wanted to see the end result. He managed to complete the program in a couple of minutes with little direction from me with the anticipation of hearing him self say “WUFF” repeatedly 😉

Another interesting point here was that I noticed that this was the first time that my younger son had learned how to drag and drop an object on a computer (in this case blocks onto the script area), that was the bulk of my help to him here. I was amazed how quickly he learned to do it and how quickly it seemed to become second nature (now compare that to his Grandparents, thats a different story 😉

Here is the source code for the Bark application.

Bark Script

Bark Script


This script simply tells a Sprite (named Thor in this case) to make a sound “Wuff” when the Flag is clicked. Simple aye?

Goal 5 – Co-develop a game and publish it to the Scratch WebSite
Okkk…. Now it was MY time to Shine!! I quickly grabbed back the drivers seat and went on to create an actual game in Scratch (The inner geek acting in me). The kids helped me to pick the theme (once again dog related). We decided to make a game where our dog Thor was eating all the goodies (Candy, Cookies, and Cake). Realize that my kids have a lot of experience with getting tackled by this 40 lb brute if they ever dare to venture around the floor with food in their hands. The boys learned this harsh lesson early on, but my daughter (age 1) is still in the learning process unfortunately.

My 6 year old did the Wuff sound for the game, while my younger son came up with the idea of making Thor get “fatter” as he ate more food and “smaller” as he missed food. Unfortunately, they couldn’t sit and watch me develop the program because well, it took me to long (couple of hours at most). I think it only took that long due to the learning curve I had with Scratch myself. In the and I was pleased with the end result (My wife was not so impressed). While I found scratch lacking in some ways (lack of functions, re-use, objects, encapsulation, polymorphism, etc…) the composition capabilities where impressive and the ways it dealt with expressions and variables visually. Moreover, building parallel threads in Scratch came naturally to the point that you weren’t even aware that you were doing it, which is the first time I can say I have seen that in an IDE.

Finally, another great aspect of the Scratch experience was the ease with which it enables you to publish and share code through a Scratch hosted service. You can find tons of great applications and examples at –> http://scratch.mit.edu/. This was yet another great lesson for the kids. They can investigate games out their on scratch and easily download the source and see how it works. Now I don’t think my kids quite grasp all of that yet, nor do they quite know the process of downloading the code themselves and opening it within the Scratch IDE, but I see it as a great foundation for working in a more open and collaborative development environment.