An Important Distinction Between Machine Learning and Human Learning

After rereading the first chapter of Hands-On Machine Learning, in addition to reading other sources, I’ve revised my answer about the difference between parameters and hyperparameters, and in doing so I realized a key difference between machine and human learning. For the past few weeks, my answer to that question had been bugging me, sitting in the back of my head, awaiting clarification. So I focused on finding a better answer. Yet even after finding a more correct one, I still hold some of my previous bias about it. Machines don’t do this.

Unless intentionally programmed to do so, a machine won’t care about what it has or hasn’t “learned”. It’s indifferent to its own output. This is most evident when it comes time to retrain a model with new data. If the machine derives new answers, it doesn’t apply those new answers to the context of the old answers (unless specifically programmed to do so), and it will never revisit decisions on its own. It blindly replaces all old data with new data, whether the new data is better or not.

Even reinforcement learning, which draws new conclusions from new circumstances, produces output the machine is indifferent about. The machine beats our best chess champion, and we stand in awe. It beats our best Jeopardy champion, and we’re amazed. It wins at Go, and we say, “Now it’s learned to be smarter than us”. Nothing could be further from the truth. It simply calculated a better decision than it previously did. It has no personal interest in playing these games. It’s completely indifferent.

Christina Wodtke explains what I’m trying to say with much more elegance in this excerpt from her book, Radical Focus. She talks about different ways to learn, or what I would call learning components. Relating this to machine learning, the instructional component would be similar to setting up a model for the machine to run, running the model would be similar to the action component, and reflection would be what’s mainly lacking from our current models. If we could build models that sufficiently favor reflection, then the need to explicitly instruct and train machines would be reduced, and we would truly have a machine that “learns” new things on its own. I propose that this is entirely possible with today’s technology, and I suspect the main reason it hasn’t become reality is that it’s not yet financially beneficial.


A Clarification of Machine Learning

In my previous post I gave three different definitions of Machine Learning. One of those definitions was something along the lines of “training a machine to learn answers on its own”. Although I had already stated that such a definition lacks context, I should’ve outright said it is either highly misleading or just plain wrong. I may get in trouble with several marketing folks for saying this, but machines do not “learn” anything in the sense that we usually define “learn”. Machines, or computers to be more specific, are powerful calculators. They have always produced answers based on specific calculations, and the field of machine learning is no different. Don’t get me wrong: they can be programmed with the most efficient algorithms possible to produce amazing results, but never in the process are they learning anything from those algorithms. They take data in, process the algorithms, and calculate a result. Period.

That said, the holy grail of machine learning is what we call AGI, and as of today we’ve made very little progress toward it. In fact, if we’re ever to make significant progress, I would argue that we need to harness something completely outside of our existing calculations, and even then we may not reach it through calculation alone. AGI, or Artificial General Intelligence, is the ability of a trained machine to apply its knowledge directly to a problem it has never been trained on. A human analogy would be our process of sensing danger. When we are young, we’re told not to touch the stove. Eventually we touch the stove. Whether we did so intentionally or accidentally, from that point on we have a very real experience that reinforces our training (I’ll talk more about reinforcement learning later). Through this reinforcement experience we gain several new inputs, which we associate with the fact that we’ve been told not to touch the stove. Later, when we approach a campfire, we feel the heat, see the flame’s color and movement, and we’re internally alerted: “Don’t touch the campfire!” We didn’t have to be told by someone else. We had a general sense that the combination of inputs was similar enough to the stove, and we correctly applied our knowledge to the new situation. What’s more, we never have to prove our decision to be correct. We just know it is.

Similarly, AGI would apply past knowledge, learned in different settings, to new scenarios. Some strides have been made toward that end, mostly in the subcategory of machine learning labeled “reinforcement learning”. However, we can’t escape the fact that all machine learning is calculation-based. The computer doesn’t consider inputs it hasn’t been fed, and it doesn’t reach outside the algorithms it’s been programmed to use. Reinforcement learning is, at its heart, just a different set of calculations that have been explicitly programmed. If you don’t tell the machine not to touch the campfire, it will walk directly into it with devastating results, regardless of what you believe it has “learned”.
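To make “just a different set of calculations” concrete, here is a minimal sketch of a Q-learning update, a common reinforcement learning technique. The states, actions, and reward values are made up for illustration; the point is that the entire “lesson” is one line of arithmetic.

import numpy as np

n_states, n_actions = 4, 2
Q = np.zeros((n_states, n_actions))  # the machine's entire "knowledge" is this table
alpha, gamma = 0.1, 0.9              # learning rate and discount factor (hypothetical values)

# One made-up experience: in state 0, taking action 1 earned reward 1.0 and led to state 2.
state, action, reward, next_state = 0, 1, 1.0, 2

# The whole "lesson learned" is a single arithmetic update to the table:
Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
print(Q)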


Hands-on Machine Learning

In my previous post I said I’m beginning a journey in the field of data analysis and machine learning. At the start of this journey I wasn’t sure which discipline I was going to lean more toward. I’ve made a decision, but it requires a bit of explanation. First of all, what do these two terms mean, anyway? Machine learning can be defined as “programming the un-programmable” or “giving computers the ability to learn on their own”. Both of these definitions are lacking some context, so I’ll provide my own answer later. I think of data analysis as “understanding the meaning of data”. The two can be used in conjunction, with a computer analyzing data and producing meaning from it.

To explain the two fields further, I’ll provide an example. Say you’re running for public office. Naturally you’ll want to figure out where to focus your efforts to help get you elected. Should you be more involved in community relations? Should you improve your speech tone and hand gestures? Which regions should you campaign in? When should you kick off your campaign? You can find reasonable answers to these questions through data analysis. By feeding data into equations, you can derive conclusions and figure out how much effort to give to each area involved in running your campaign. Machine learning similarly takes data in, but instead of having people analyze it, it entrusts a computer, fed an additional input of parameters, to produce the results you will ultimately act on.

Both fields are important. Both fields are interesting. I could proceed down either path, but given my past experience and strengths, I’ve decided to primarily pursue machine learning. As I have time I’ll likely mix in more data analysis, and ultimately I’d like to end up in a position where I can use either discipline equally well.

In my previous post I also gave kudos to Jason Brownlee. I will forever be indebted to him for encouraging me to get beyond the difficulties I was experiencing early on and to continue moving forward instead of waiting for the perfect conditions. My most recent step in this direction has been to begin reading and working through the examples in the book Hands-On Machine Learning with Scikit-Learn and TensorFlow. A second edition, due out in October, incorporates the Keras framework that I’m very eager to learn, but I decided I couldn’t wait that long to get started. I plan to go through the second edition after it’s released.

I normally skip quizzes and exercises when I’m in learning mode. After reading the first chapter, however, I felt that this time around blogging my answers could help reinforce my learning. I’m also a firm believer that if you can’t teach something yourself, you haven’t learned it well enough. If I remember to, I’ll come back at some point as my learning progresses and grade myself on how well I grasped these concepts today. Then I’ll work on relearning the points I didn’t quite get right. That said, here are the questions of the first chapter:

1. How would you define Machine Learning?

I already stated a couple of common definitions above, but now I’ll state my own. As I understand it, machine learning is the practice of providing computers with data and with parameters for analyzing that data, with the goal of determining its meaning. The results of machine learning can be used to help people make decisions or, as a sub-field of Artificial Intelligence (AI), can be used to directly automate the decision-making process.

2. Can you name four types of problems where Machine Learning shines?

“Big data” has become a hot buzzword in the past decade. Machine learning works very well when it has very large amounts of data to work with. “Big data” problems include recommender systems, facial recognition, email spam filtering, and fraudulent transaction detection.

3. What is a labeled training set?

The most common type of machine learning in use today is called “supervised” learning. In supervised learning we feed our model both the data and the result we’re looking for from that data. For example, we may provide a collection of emails, all of which have already been labeled by someone as either “spam” or “not spam”. We use the labels to help train the model to “figure out” what makes an email fall into one category or the other. The trained model can then be used to predict which label should be assigned to newly received emails. In this particular example, most email providers take the process a step further and actually remove emails from view if they have been labeled as spam.
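As a rough sketch of what a labeled training set looks like in code, here is a tiny spam classifier using scikit-learn. The handful of emails and their labels are invented purely for illustration.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# A (made-up) labeled training set: each email comes paired with its label.
emails = [
    "Win a free prize now",
    "Meeting rescheduled to 3pm",
    "Cheap meds, limited time offer",
    "Lunch tomorrow?",
]
labels = ["spam", "not spam", "spam", "not spam"]

# Turn the text into numeric features, then train on the labeled examples.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)
model = MultinomialNB().fit(X, labels)

# The trained model can now predict a label for a new, unseen email.
print(model.predict(vectorizer.transform(["Free prize, click now"])))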

4. What are the two most common supervised tasks?

As hinted at in the example from my previous answer, supervised learning can be used to help categorize data, a task known as “classification”. Continuing the email example, we can not only determine simple category buckets such as “spam” vs. “not spam”, but we can also train on and predict additional category buckets such as “high”, “medium”, or “low” priority. The second most common supervised task is what’s known as “regression”, a name which is not very intuitive and should be considered a misnomer, but it’s been used so often that we’re basically stuck with it. In machine learning, “regression” means predicting a numerical value, such as the price of a stock.
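To make the regression task concrete, here is a minimal scikit-learn sketch that predicts a number rather than a category. The sizes and prices are made-up values.

import numpy as np
from sklearn.linear_model import LinearRegression

# Regression: predict a numeric value (a hypothetical price from a size).
sizes = np.array([[50], [80], [120], [200]])   # feature: square meters
prices = np.array([150, 240, 350, 600])        # target: a number, not a category
reg = LinearRegression().fit(sizes, prices)

print(reg.predict([[100]]))  # an estimated price for 100 square meters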

5. Can you name four common unsupervised tasks?

Today, unsupervised learning is most commonly used to support decision-making processes rather than to produce specific results. One common unsupervised task is clustering: looking for patterns in data to determine what individual items have in common. For example, you may provide a model with air quality, water quality, life expectancy, and regional data, and it may group related inputs into categories. Another unsupervised task is dimensionality reduction, which simplifies the features before they’re fed into other machine learning models. A third task is anomaly detection. Unsupervised learning is also used in visualization. Lastly, an exception to the “support only” role would be something like news articles being grouped based on similar content. In that type of model the exact label of the news articles is irrelevant; it’s their content that actually matters.
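Here is a hedged sketch of the clustering task using scikit-learn’s KMeans. The regional quality-of-life numbers below are made up for illustration.

import numpy as np
from sklearn.cluster import KMeans

# Each row is a region: [air quality index, water quality index, life expectancy]
regions = np.array([
    [80, 85, 79],
    [30, 40, 65],
    [78, 90, 81],
    [35, 38, 63],
])

# No labels are provided; the algorithm groups similar rows on its own.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(regions)
print(km.labels_)  # which group each region was assigned to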

6. What type of algorithm would you use to allow a robot to walk in various unknown terrains?

A supervised regression algorithm would produce a good result for this. Alternately, reinforcement learning, which I’ve not yet defined, might also be ideal. Given a sufficient amount of data, a regression algorithm can predict the speed, pressure, direction, or other parameters needed to walk successfully on the terrain type it’s being presented with.

7. What type of algorithm would you use to segment your customers into multiple groups?

This sounds like the first task I mentioned in my answer to question 5. An unsupervised clustering algorithm can be used to determine which types of customers buy which types of products, or which customers contribute the most to the bottom line, for example.

8. Would you frame the problem of spam detection as a supervised learning problem or an unsupervised learning problem?

Although I’ve already answered this by using it as an example of supervised learning, unsupervised learning can be used to support the process. At the end of the process, however, we benefit more from getting a specific label than from just grouping things together, which is what most unsupervised algorithms would do.

9. What is an online learning system?

As opposed to a batch learning system, which is fed a large amount of data before it works on producing results, an online learning system produces results incrementally, gradually making corrections to future results as needed. As a comparison to our own human learning, online learning is like a child finding out that hairy animals are called cats, then later being presented with another type of hairy animal and told it is not a cat, but is actually a monkey. As we see more and more types of animals, we’re better able to determine which animals belong to which categories, and why.
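Below is a minimal sketch of this incremental style of learning, using scikit-learn’s partial_fit. The two-feature data is randomly generated stand-in data arriving in small batches.

import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(random_state=0)
classes = np.array([0, 1])  # all possible labels must be declared on the first call

rng = np.random.default_rng(0)
for _ in range(3):  # each loop iteration stands in for a new batch of data arriving
    X_batch = rng.random((10, 2))
    y_batch = (X_batch[:, 0] > 0.5).astype(int)
    clf.partial_fit(X_batch, y_batch, classes=classes)  # learn incrementally, batch by batch

print(clf.predict([[0.9, 0.1]]))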

10. What is out-of-core learning?

I don’t fully understand this term as described in the text (at least not well enough to explain it myself). Since I’ve not heard the term before today, I don’t believe it to be very important to know, so I’m skipping this question.

11. What type of learning algorithm relies on a similarity measure to make predictions?

This question might be hinting at k-Nearest Neighbors, which looks at the labels given to training instances with similar features.
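A minimal k-Nearest Neighbors sketch: the prediction comes from the labels of the most similar (nearest) training instances. The points and labels are made up.

from sklearn.neighbors import KNeighborsClassifier

X_train = [[1, 1], [1, 2], [8, 8], [9, 8]]      # made-up two-feature instances
y_train = ["small", "small", "large", "large"]  # their labels

knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print(knn.predict([[2, 1]]))  # its nearest neighbors are mostly "small", so it predicts "small"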

12. What is the difference between a model parameter and a learning algorithm’s hyperparameter?

I’m not used to this terminology, but if I assume that “model parameter” is the same as a “feature”, then features make up the input data that a model is fed at the beginning of the process. Hyperparameters are additional inputs which are applied to the feature data. Features are known. Hyperparameters must be discovered.

Edit: Not only did I misunderstand this question the first time around, but I was also confusing hyperparameters with weights. Weights, or model parameters, can be initialized by the user, but they are generally modified by the model as it runs. Hyperparameters, on the other hand, are generally applied statically to an algorithm prior to execution and remain unchanged throughout. Honestly, I’m not convinced it matters much what you call them so much as it matters that you know they exist. In fact, the line between parameters and hyperparameters is blurry, and it would be much easier if we didn’t try to draw distinct lines between the two and simply called every knob that can be adjusted in a model a parameter.
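As a hedged illustration of the distinction, in the scikit-learn sketch below alpha is a hyperparameter (set by us before training and unchanged during it), while coef_ and intercept_ are model parameters (found by the algorithm during fitting). The data is made up.

import numpy as np
from sklearn.linear_model import Ridge

X = np.array([[1], [2], [3], [4]])
y = np.array([2.1, 3.9, 6.2, 7.8])

model = Ridge(alpha=1.0)  # hyperparameter: chosen before training, static throughout
model.fit(X, y)

print(model.coef_, model.intercept_)  # model parameters: learned from the data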

13. What do model-based learning algorithms search for? What is the most common strategy they use to succeed? How do they make predictions?

Maths, basically. I’m not sure what more this question is trying to ask.

14. Can you name four of the main challenges in Machine Learning?

What first comes to mind is “underfitting” vs. “overfitting”. I’m not sure those can be considered two different challenges, but since the approaches to solving them are significantly different, I’ll count them as two. Irrelevant and incomplete data are two more challenges to overcome.

15. If your model performs great on the training data but generalizes poorly to new instances, what is happening? Can you name three possible solutions?

This is what’s known as “overfitting”. It means that your model was too specific when it calculated what meanings to attribute to the data on an individual basis. One possible solution is to constrain the model through “regularization”, using hyperparameters to control how tightly it fits the data. Another possible solution is to collect more data, although this may not always help. A third solution is to analyze the features, determine their relevance, and then either weight them differently or eliminate them from the model entirely.
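A sketch of the regularization idea with scikit-learn: the alpha hyperparameter constrains a flexible polynomial model, and a larger alpha keeps the learned coefficients from growing wild. The noisy data is generated just for illustration.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (20, 1))
y = 3 * X.ravel() + rng.normal(0, 0.1, 20)  # a simple trend plus noise

# A degree-10 polynomial can easily overfit; a larger alpha reins it in.
loose = make_pipeline(PolynomialFeatures(10), Ridge(alpha=1e-6)).fit(X, y)
constrained = make_pipeline(PolynomialFeatures(10), Ridge(alpha=1.0)).fit(X, y)

# The regularized model's coefficients stay much smaller (i.e., simpler).
print(abs(loose.named_steps["ridge"].coef_).max())
print(abs(constrained.named_steps["ridge"].coef_).max())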

16. What is a test set and why would you want to use it?

A test set is either a subset of the data or an entirely different data set, used to evaluate the model after it has learned from the training data set. You would use a test set to determine how well your model has been trained.
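A minimal sketch of holding out a test set, using scikit-learn’s train_test_split and the built-in iris data set.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold back 20% of the data; the model never sees it during training.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(model.score(X_test, y_test))  # how well the model generalizes to unseen data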

17. What is the purpose of a validation set?

This set is used to estimate how well a model is doing before it’s scored against a test set. A validation set can then be used to tweak the model while training is in progress.

18. What can go wrong if you tune hyperparameters using the test set?

I’ll have to guess at this one, because I don’t entirely understand the concept, but I would guess that if you use up your test set this way (for something a validation set is meant for), you’ll likely overfit to the test set, and you’ll be unaware you’ve done so because your test score will no longer be an honest measure of performance on truly unseen data.

19. What is cross-validation and why would you prefer it to a validation set?

I hinted at this in my answer to question 16. Cross-validation dynamically subdivides a data set into smaller training and validation sets. One benefit of cross-validation is that you can repeat the process with different combinations to ultimately obtain better results. Cross-validation can also be a strategy to help make up for a lack of data, or it can even be used to reduce bias that may be intentionally or unintentionally introduced in static validation and test sets.
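A minimal cross-validation sketch with scikit-learn: the data is split into five folds, and each fold takes a turn serving as the validation set.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores, scores.mean())  # one score per fold, plus the average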


Data Analysis and Machine Learning

I don’t have a lot to say at the moment, but I just had the thought that I should update this blog with my recent activity, which has mostly involved data analysis and machine learning. At the moment I’m not sure which specific path I’m going to gravitate more toward, but chances are I’m going to end up using both in the future, as they’re closely related. I’ve recently begun self-study courses available on edX and Coursera covering both topics, and right now I’m delving into statistics.

I’m loving what I’m learning, and I’m eager to get started on using it practically. Shout out to Jason Brownlee over at machinelearningmastery.com, who has inspired me to actually get started with practical application now, rather than waiting until I feel I know “enough” to get started.

I’ll likely be posting more as I continue on this new journey. Look for my next post soon!


AutoScroll in WinForms

Edit: When considering how to get TableLayoutPanels to AutoScroll…don’t. Use a SplitContainer whenever possible instead. Setting Dock.Top as noted below still applies, but do it in the SplitContainer and your life will be immensely better.

Every time I need to adjust AutoSize and/or AutoScroll in WinForms, it’s always a struggle to get things working and looking right. Hopefully this post will help me avoid this struggle the next time I have to do it.

To enable AutoScroll on a panel:

AutoSize must be false on the panel itself.
If the panel is contained within another panel, that panel’s AutoSize must also be false.
If any of the panels is a TableLayoutPanel, the row containing the control that needs to AutoScroll must be set to Percent, 100.
All controls within all of these panels must be set to Dock.Fill.
Lastly, if adding controls dynamically to any of the panels, set them to Dock.Top.


Software Developer Learning Resources

It seems that more and more resources are sprouting up online for software developers, most of which are pretty basic courses: “Learn to code in Java”, “Create a website in 30 days”, “Android Development Basics”, and the like. Then there are the university courses that let you earn college credit for cheap, or just audit the course for free. I’ve browsed a variety of these online resources, and my experience has been pretty mixed. Sometimes they help me, but other times I feel like a physical book is a much better choice.

Even within a single company, I find courses that I get a lot out of, and others that basically just waste my time. Pluralsight is one such company. I often see one of their course titles that interests me, but when I actually take the course it’s hit or miss whether or not it helps me out. I recently took a few SQL Server courses, but I didn’t get out of them what I was looking for. I was left feeling that my time would have been better spent reading books on the same topics.

My learning style may have a lot to do with whether or not online courses benefit me. For me, watching a course or listening to someone explain something isn’t enough for things to sink in. I like to see the big picture, play with things, and discover on my own how and why something works the way it does. Generally after I’ve spent some time in discovery, then I can watch someone else explain a topic and pick up on everything they’re saying.

Overall, physical books seem the best way for me to learn. With a book in front of me I can read something at the pace that I’m comfortable with, and if I think I missed something I can easily redirect my eyes to read it again. Then I can pause, think about what I’ve read, try it out on my computer, re-read it, reach an “aha” moment, then proceed on to the next section of the book.

I have a hard time doing the same with online resources, mainly because the controls are cumbersome. Sometimes I play the skip-forward, skip-back game, wasting time trying to find the spot that I need to get back to. Then by the time I find it, I’ve lost my context. I also find that most online instructors rush to get to the next section of the video. I don’t know if they’re on a strict time budget, if they’re concerned about keeping their audience’s attention, or what. The courses that benefit me the most are the ones that take time to explain the same concepts in different ways before moving on.

Those are just some of my thoughts regarding online developer resources. I’d love to hear other developers’ experiences though.

Which online resources have you tried?
Which online resources worked best for you?
Do you generally learn more from books or from videos?
Have you paid for courses that turned out to be a bust?
Are your choices dictated by your learning style, or can you adapt to any resource?
Are there particular resources out there that mesh well with multiple learning styles?
Have you taken any courses that should charge an arm and a leg because they’re so good?


Using images from another project in VS 2010

This post is a little self-serving, as I might be the only one who needs to know this in the future, but I’m sure I’ll forget all about this some day when I google this problem and find the answer in my own post.

Today I needed to use an image that was in one VS2010 project in a control of another project in the same solution. While you can certainly copy and paste the image into both resource files, I found a way that I can share the same file.

Step 1: Right-click the project you want the image in and choose ‘Unload Project’ from the context menu.
Step 2: Right-click the project again and choose ‘Edit [project name].csproj’.
Step 3: Find an <ItemGroup> section and insert the following code into it:

<EmbeddedResource Include="..\[project name]\Properties\Resources.resx">
    <Link>[assembly name].Properties.Resources.resx</Link>
    <Generator>ResXFileCodeGenerator</Generator>
    <LastGenOutput>[assembly name].Properties.Resources.Designer.cs</LastGenOutput>
    <CustomToolNamespace>[assembly name].Properties</CustomToolNamespace>
</EmbeddedResource>

Replace “[project name]” with the actual name of the project that contains the resource file of the image you want to use.
Replace “[assembly name]” with the actual name of the assembly for the project.

Step 4: Right-click the project and choose ‘Reload Project’. If prompted, allow VS to close the .csproj file.
Step 5: When you bring up the Image picker dialog for the control, you’ll now have the new resource file to choose from in the dropdown menu:

[Screenshot: VS2010_Sharing_Project_Images]