Course recommendations as a graph

Aalto University has a fairly new and concise site for browsing its course selection. One interesting novelty is recommendations, presented as a Related courses section at the bottom of each course page.

The site allows harvesting, so I became curious about what a course network would look like. Are some courses recommended more than others? Which ones?

While working on this, I learned new things about Python, both as a harvesting platform and as a tool for constructing a directed graph in the form of a GEXF file. A nice example by Christopher Kullenberg helped a great deal here.
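To give an idea of what constructing a directed graph as a GEXF file can look like in Python, here is a minimal sketch using only the standard library. It is not the code from my notebooks: the function name `build_gexf`, the shape of the input dictionary and the sample course codes are made up for illustration, and I assume the GEXF 1.2 draft namespace.

```python
import xml.etree.ElementTree as ET

# Assumption: GEXF 1.2 draft namespace.
GEXF_NS = "http://www.gexf.net/1.2draft"


def build_gexf(recommendations):
    """Return a <gexf> element for a dict {course: [recommended courses]}."""
    ET.register_namespace("", GEXF_NS)
    gexf = ET.Element(f"{{{GEXF_NS}}}gexf", version="1.2")
    graph = ET.SubElement(gexf, f"{{{GEXF_NS}}}graph",
                          mode="static", defaultedgetype="directed")
    nodes = ET.SubElement(graph, f"{{{GEXF_NS}}}nodes")
    edges = ET.SubElement(graph, f"{{{GEXF_NS}}}edges")

    # Every course that appears as a source or as a target becomes a node.
    codes = sorted(set(recommendations) |
                   {t for targets in recommendations.values() for t in targets})
    for code in codes:
        ET.SubElement(nodes, f"{{{GEXF_NS}}}node", id=code, label=code)

    # One directed edge per recommendation: source recommends target.
    edge_id = 0
    for source in sorted(recommendations):
        for target in recommendations[source]:
            ET.SubElement(edges, f"{{{GEXF_NS}}}edge",
                          id=str(edge_id), source=source, target=target)
            edge_id += 1
    return gexf


# Hypothetical sample data, not harvested from the real site.
sample = {"AAA-1": ["BBB-2", "CCC-3"], "BBB-2": ["CCC-3"]}
gexf_root = build_gexf(sample)
gexf_xml = ET.tostring(gexf_root, encoding="unicode")
```

The resulting string can be written to a `.gexf` file and opened in Gephi for layout and styling.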

The final graph is visualized with the GEXF JavaScript Viewer. The network layout is Gephi’s ForceAtlas2 with default parameters. The size of a node (= course) reflects its In-Degree: the bigger the circle, the more courses recommend this particular one.
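In-Degree is simply the number of edges pointing at a node. Gephi calculates it for you, but as a rough sketch it could be counted in Python like this. The edge list and course codes are invented, and `in_degree_ranking` is just an illustrative name.

```python
from collections import Counter


def in_degree_ranking(edges):
    """Rank targets by how many (source, target) edges point at them."""
    counts = Counter(target for _source, target in edges)
    # most_common() sorts from the highest count to the lowest.
    return counts.most_common()


# Hypothetical recommendations: (recommending course, recommended course).
sample_edges = [
    ("AAA-1", "CCC-3"),
    ("BBB-2", "CCC-3"),
    ("AAA-1", "BBB-2"),
]
ranking = in_degree_ranking(sample_edges)
```

Courses that nobody recommends do not appear in the ranking at all, which corresponds to an In-Degree of zero.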

The color of the node reveals the School. The RGB values are taken from the Aalto University Visual Identity instructions.

Click a node, and an information panel opens up to the left.

Most of the nodes come with core metadata like title, number of credits, and description. If these are missing, it most certainly means that the harvester didn’t find anything because my Python code was too optimistic. Although the course pages are built with similar HTML elements and attributes, there are exceptions. For example, I realized that some 50 course titles are not within an H3 element. Because the harvest took more than three hours (!), I didn’t want to bother the site with a re-run. Those few nodes with a high In-Degree but without any course metadata I edited manually in Gephi’s Data Laboratory before exporting the data.
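In hindsight, a less optimistic harvester would try a few fallbacks before giving up on a title. The sketch below shows the idea with the standard library HTML parser. Only the H3 element comes from the real pages; the fallback elements (`h2`, `h1`, `title`) and the names `TitleFinder` and `extract_title` are my assumptions, not what the site necessarily uses.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Collect the text of the first element found for each candidate tag."""

    def __init__(self, candidates):
        super().__init__()
        self.candidates = candidates
        self.found = {}     # tag name -> text of its first occurrence
        self._open = None   # candidate tag currently being read, if any

    def handle_starttag(self, tag, attrs):
        if (tag in self.candidates and tag not in self.found
                and self._open is None):
            self._open = tag
            self.found[tag] = ""

    def handle_data(self, data):
        if self._open is not None:
            self.found[self._open] += data

    def handle_endtag(self, tag):
        if tag == self._open:
            self._open = None


def extract_title(html, candidates=("h3", "h2", "h1", "title")):
    """Return the first non-empty candidate text in priority order, or None."""
    finder = TitleFinder(candidates)
    finder.feed(html)
    for tag in candidates:
        text = finder.found.get(tag, "").strip()
        if text:
            return text
    return None  # caller can log the page for manual checking
```

For example, `extract_title("<div><h2> Some course </h2></div>")` falls back to the H2 element, and a page with none of the candidates returns `None` instead of silently producing an empty node.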

By default, after the Gephi process, inbound links (i.e. courses that recommend the node in question) were listed nicely and correctly in the information panel. However, outbound links (courses that the node in question recommends) were not. The course code was OK, but the title was, incorrectly, the title of the source node. Digging into the JavaScript code of the viewer, I found out that the label of the source node was indeed used as the title of the outbound link. After some more pondering, it dawned on me that I could perhaps make use of the idle weight attribute of the edge (i.e. of the link/recommendation). Luckily, with only minor modifications to the JavaScript code, it worked.
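The transformation itself is not shown here, so the following is only one way the idea could be sketched in Python: copy the label of each edge’s target node into the edge’s otherwise unused weight attribute, so that a viewer can read the title from there. Note that this is a hack, since weight is meant to be numeric in GEXF. The sample document, the course codes and the function name are hypothetical, and I assume the GEXF 1.2 draft namespace.

```python
import xml.etree.ElementTree as ET

# Assumption: GEXF 1.2 draft namespace.
NS = {"g": "http://www.gexf.net/1.2draft"}

# Hypothetical two-node graph, not real course data.
SAMPLE = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph defaultedgetype="directed">
    <nodes>
      <node id="AAA-1" label="First course"/>
      <node id="BBB-2" label="Second course"/>
    </nodes>
    <edges>
      <edge id="0" source="AAA-1" target="BBB-2"/>
    </edges>
  </graph>
</gexf>"""


def copy_target_title_to_weight(root):
    """Store each edge's target label in its weight attribute (non-standard)."""
    labels = {node.get("id"): node.get("label")
              for node in root.iterfind(".//g:node", NS)}
    for edge in root.iterfind(".//g:edge", NS):
        # Fall back to an empty string if the target node is missing.
        edge.set("weight", labels.get(edge.get("target"), ""))
    return root


root = copy_target_title_to_weight(ET.fromstring(SAMPLE))
edge = root.find(".//g:edge", NS)
```

The viewer side then needs a matching change to show the edge’s weight, instead of the source node’s label, as the title of an outbound link.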

I guess I could have made the GEXF modifications I needed within Gephi too, but I decided to brush up my dormant XSLT, once an everyday language at work due to frequent XML transformations but today an exception.

So, which courses are recommended the most?

Number one is MUO-C3007, Design traditions, a bachelor-level course on the legacy of design, provided by the School of ARTS. The course is recommended by all other Schools of Aalto except SCI, School of Science. By hovering the cursor on top of the inbound links you can see how far, network-wise, some recommendations come from. I guess the design of physical artifacts follows similar historical traditions no matter what the realm of the final product is.

The second most frequently recommended one, not far behind MUO-C3007, is A23E53015, a master’s-level evening course offered by the Open University, titled How to manage and assess the power of the brand (my translation).

Which courses are the most active in recommending others, you might ask. Well, ranked this way, differences are hard to discern. Most courses recommend many others.

Jupyter notebooks and XSLT code are available on GitHub.

Posted by Tuija Sonkkila

About Tuija Sonkkila

Data Curator at Aalto University. When out of office, in the (rain)forest with binoculars and a travel zoom.