Introduction

In this post we actually implement our canned response extension. This post is long (mostly because of code samples and screenshots). Type along and take it slow. Extension development rewards patience and discipline. If you can’t be disciplined, be patient. If you can’t be patient, be disciplined. Whatever you do, don’t get overwhelmed by the length of this post.

For beginner developers, I hope also to show you my software development process for unfamiliar code APIs.

First Goal: Unprivileged Page Doing Something, Anything

Let’s create a new developer extension. I’m not going to cover the directory setup because it’s covered very well in the Extension tutorial.

You’re going to be creating your new extension (if on OSX) in a directory like ~/Library/Application\ Support/Google/Chrome/Default/Extensions. I’m going to call my dir canned_responses. After creating it, I’m going to cd into the directory.

Advanced: The Default is for the name of the Chrome user (Chrome supports multiple users! It’s a gift to developers!) you’re developing as. If you’re developing as user “Cthulu” your subdirectory would be Chrome/Cthulu/Extensions.

Sub-Step A: Set Up Version Control

We want to use version control for sharing our code and for back-up and undo purposes (if necessary), so my first step is to get to the canned_responses directory and set up git. I’m going to initialize a preliminary manifest.json file and make an initial commit:

echo "{}" > manifest.json; git init; git add . ; git commit -m "Initial commit"

This sets up the directory for git.

Sub-Step B: Start a New Feature Branch

I’m going to switch to a branch called basic-page-action-and-content-script-poc. I want to prove the concept (POC means proof-of-concept) of a basic page action and content script.

git checkout -b basic-page-action-and-content-script-poc

Sub-Step C: Load the “Developer Extension”

I’m going to next open the Extensions window in Chrome, enable developer mode via the checkbox, and “Load an Unpacked Extension.”

Loading an unpacked extension

I’ll navigate the Extensions “finder” to ~/Library/Application Support/Google/Chrome/Default/Extensions/canned_responses. When I do this, Chrome tells me that it couldn’t load a manifest.json.

Incomplete manifest error

But at least it’s looking in the right directory and I know what the next step must be: create a proper manifest.json.

Sub-Step D: Manifesting a manifest.json

Well the first thing I’m going to add is a manifest_version, because I just got an error about it. Looking at the manifest documentation I saw a few other sensible properties to provide, so I added them as well.

{
  "manifest_version": 2,
  "name": "Canned Response",
  "version": "0.0.1",
  "description": "For LinkedIn domain, offer canned responses for saving time and carpals."
}

When I tell Chrome to reload, things are working!

Loading an unpacked extension

Don’t sell this moment short: we should do a happy-dance. Let’s make a commit here.

git commit -a -m 'Provide load-able manifest.json file'

But we don’t have any code running. Remember this step’s purpose is to make a basic unprivileged page to do something. We’re not really doing anything…yet.

Sub-Step E: Add A Page Action

Recall our architecture: we want to create a “background page” that works in concert with a “content script.” Here’s where we get that “background page” in place.

Now here’s another place where Google documentation can get us confused. In the Background Page documentation, our first code sample doesn’t have a background.html but rather a background.js. It turns out that Chrome will auto-create a background.html file if it finds only a background.js.

So, as a “simplest thing that could work” test, we’re going to add an alert() in background.js.

ASIDE: I believe the model is “if you have multiple scripts, you might create background.html and <script> include them there. If you have only one short-ish script, then you can just use a background.js file.”

Let’s add an alert("razzle") there. We’ll first make a background script available, and then add the alert code.

Add to: manifest.json

{
...
  "background": {
    "scripts": ["background.js"],
    "persistent": true
  }
...
}

Create: background.js

alert("razzle");

We’ll discuss the persistent key in a moment.

Go to the Extensions tab and “reload” canned_response and see our alert() fire.

Getting razzled!

NOTE: For the rest of this document I’m going to call that step “Reload the plugin.” As an extension developer you’re going to do this a lot.

Let’s make a commit here:

git add .; git commit -am 'Add background.js: Runs when plugin (re)loads'

Sub-Step F: Parsing Google-ese

At the top of the Background Pages document, Google asks us to consider using an Event Page instead. This document describes how Chrome will load / unload extensions as needed. This sounds like good citizenship so we should do it. Let’s make the appropriate change so that we can start feeling the pain when our app is small.

The Event Page documentation says that the change is to merely change the persistent attribute in our manifest.json to false. The default is true. To make this explicit I typed it out in our previous commit.

Let’s make the change.

{
  ...
  "background": {
    "scripts": ["background.js"],
    "persistent": false
  }
  ...
}

Well, OK, great! The documentation then tells me “The event page is loaded when it is ‘needed’, and unloaded when it goes idle again.” Well, I’m real interested in how to get my code to load.

Google docs on 4 modes for awakening Event Pages

What?!

OK, “when first installed” I get. When we do something from Extensions menu, our code is activated. Seems decent. Thanks, Chrome.

“The event page (wait, that’s what we’re writing) was listening for an event.” Hm, that seems sensible and a good fit for us. Any friendly, obvious examples of the most common task all browsers do, Google? Nope.

:-/ :-/ :-/

Those other two options seem more complex. I’ve not created a “content script” yet and I don’t even want to know about other views. Keep scrollin’.

Event Registration topic

We get to the “Event registration” page and it’s pretty severe with us. Basically it’s saying make sure your critical event (“How about you show me an example of one, already?”) doesn’t only fire during runtime.onInstalled. Well, gee, instead of admonishing me not to screw up, how about you help me do right, Google?
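Here is a minimal sketch of what that admonition seems to be asking for: do one-time setup in runtime.onInstalled, but register the listener you actually depend on at the top level of background.js, so it gets registered again every time Chrome wakes the event page. The registerListeners helper and its log parameter are my own names for illustration, and the sketch assumes the webNavigation permission has been granted in manifest.json.

```javascript
// Hypothetical helper (not part of the Chrome API): registers every listener
// the event page relies on. `api` is the global `chrome` object in a real
// extension; `log` is any function that accepts a string.
function registerListeners(api, log) {
  // One-time setup belongs here; this fires on install and on update.
  api.runtime.onInstalled.addListener(() => log("installed or updated"));

  // The listener we depend on is registered outside of onInstalled, so it is
  // re-registered each time the event page is loaded.
  api.webNavigation.onCompleted.addListener(details => {
    log("loaded tab " + details.tabId);
  });
}

// Runs at the top level of background.js, not inside another callback.
// The guard assumes the "webNavigation" permission is in manifest.json.
if (typeof chrome !== "undefined" && chrome.webNavigation) {
  registerListeners(chrome, console.log);
}
```

Passing the API object in as a parameter is only a convenience so the helper can be exercised outside of Chrome; in a real event page you could call chrome.* directly.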

On the up-side, there’s a hint in here. It looks like events are named on*. That’s helpful. I’ll keep my eyes alert to on* things.

Now you might be tempted to give up, but it turns out Google has buried the gold we need in the Best Practices section in item #4.

ASIDE: Seriously? Calling it “Best Practices” makes it sound optional. Putting the most standard event at #4 in an “optional” section seems designed to confuse.

Event name that we can listen for

We’re told about an event-y and browser-y sounding event called webNavigation.onCompleted. Totally promising! Following its documentation, we’re rewarded with the news that it’s “[f]ired when a document, including the resources it refers to, is completely loaded and initialized.”

onCompleted_snippet

Happy dance!

It then talks about filters. I don’t understand what it’s talking about here. It says “Filters array of object url”. Uh, are these, like, arguments? To something? How do I use these? GOOGLE WHAT DO YOU WANT FROM ME?

I mean, I get that they’re Objects that contain these cool properties. Sure. But, uh, what do I do with them. Explain it to me, GOOGLE.

OK, let’s not freak out. Let’s click on the “[event filtering]” link.

Filtered event sample

OK, and we finally see a helpful example:

chrome.webNavigation.onCommitted.addListener(function(e) {
  // ...
}, {url: [{hostSuffix: 'google.com'},
          {hostSuffix: 'google.com.au'}]});
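So the filters are the second argument to addListener: an object whose url key holds an array of conditions. As I read the docs, the listener fires when the URL satisfies any one entry in that array. Here is a small sketch that builds the same filter from a list of hosts; hostSuffixFilter is a hypothetical helper of mine, not something Chrome provides.

```javascript
// Hypothetical helper: turn a list of host suffixes into the filter object
// that webNavigation events accept as the second argument to addListener.
function hostSuffixFilter(hosts) {
  return { url: hosts.map(host => ({ hostSuffix: host })) };
}

// Same shape as the filter in the documentation sample.
const filter = hostSuffixFilter(["google.com", "google.com.au"]);

// The guard assumes the "webNavigation" permission is in manifest.json.
if (typeof chrome !== "undefined" && chrome.webNavigation) {
  chrome.webNavigation.onCommitted.addListener(details => {
    console.log("navigation committed: " + details.url);
  }, filter);
}
```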

ASIDE: What a royal pain. I’mma gonna have to file lots of doc bugs on this thing.

Let’s listen for a reasonable event (finish of web page load) on a very simple test site. I’m going to use my website but any simple site with an input field should work. Feel free to swap stevengharms.com with your simple website of choice. Make background.js look like this:

chrome.webNavigation.onCompleted.addListener(function(details) {
    chrome.tabs.executeScript(details.tabId, {
        code: 'alert("razzle")'
    });
}, {
  url: [{ "hostContains": "stevengharms.com" }]
});

This means that our razzle will happen only after page load of a stevengharms.com page, theoretically.

Reload the plugin. Visit a stevengharms.com page or reload it. Nothing. At this point I wasn’t sure if this was a “good nothing happened” or a “bad nothing happened.” I need data from the error console.

Sub-Step G: Debugging the Background Page

This is one of the most important techniques in this document! Visit the Extensions page and click on “background page” next to “Inspect views” and load our background content.

NOTE: Henceforth I will call this the “background page console.” We’ll be visiting this place a lot.

onCompleted error when API permissions are not present

It says…

“It can’t read property onCompleted.” That’s because we’ve not asked for the webNavigation API permission in our manifest.json file. As such, chrome.webNavigation, the object that ought to bear onCompleted, isn’t there. Update the manifest.json with:

"permissions": [ "webNavigation" ]

Reload the plugin. Reload a page.

Still nothing. Consult the background page console error log.

onCompleted error when API permissions are not present

Here the error is telling us that we also need to declare which pages our extension is allowed to touch. Add a match pattern for the test site to the permissions:

"permissions": [ "webNavigation", "*://stevengharms.com/*"  ]

Reload the plugin, reload a stevengharms.com page…and razzle!

stevengharms.com razzles you

Try visiting another tab and you should not be razzled.
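If you’d rather not wait for a silent failure, one option is to give executeScript a callback and check chrome.runtime.lastError inside it, which Chrome sets when a call like this fails. The sketch below assumes that approach; injectWithReport and its report parameter are hypothetical names of mine.

```javascript
// Hypothetical helper: inject a code string and report why it failed, if it
// did. `api` is the global `chrome` object in a real extension.
function injectWithReport(api, tabId, code, report) {
  api.tabs.executeScript(tabId, { code: code }, () => {
    // runtime.lastError is only meaningful inside this callback.
    const err = api.runtime.lastError;
    report(err ? "injection failed: " + err.message
               : "injected into tab " + tabId);
  });
}

// The guard assumes the "webNavigation" permission is in manifest.json.
if (typeof chrome !== "undefined" && chrome.webNavigation) {
  chrome.webNavigation.onCompleted.addListener(details => {
    injectWithReport(chrome, details.tabId, 'alert("razzle")', console.log);
  }, {
    url: [{ hostContains: "stevengharms.com" }]
  });
}
```

The messages still land in the background page console, but now they say which step went wrong.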

I’m going to paste the git diff here so you can verify your code has been changed appropriately.

diff --git a/background.js b/background.js
index 9a441ca..320ecda 100644
--- a/background.js
+++ b/background.js
@@ -1 +1,7 @@
-alert("razzle");
+chrome.webNavigation.onCompleted.addListener(function(details) {
+    chrome.tabs.executeScript(details.tabId, {
+        code: 'alert("razzle")'
+    });
+}, {
+  url: [{ "hostContains": "stevengharms.com" }]
+});
diff --git a/manifest.json b/manifest.json
index 1e1d6fe..d2cb2b5 100644
--- a/manifest.json
+++ b/manifest.json
@@ -6,6 +6,8 @@

   "background": {
     "scripts": ["background.js"],
-    "persistent": true
-  }
+    "persistent": false
+  },
+
+  "permissions": ["webNavigation", "*://stevengharms.com/*"]
 }

Awesome! We now have an “Event Page” that’s waiting for the right URL to do something. This is the “Page Action” part of our architecture. Let’s make a commit.

We’ll commit with:

git add .; git commit -m 'Move to event page activated at test site' -a

I think it’s fair to say that at this point we’ve got an unprivileged page doing something / anything. The other piece of our architecture that we want to get working at a basic level is a content script, so let’s make that our new goal.

Second Goal: Add a Content Script

While the documentation covered this material, in plain-speak here’s the essential truth: background scripts can do all the chrome.* API stuff, but only content scripts can manipulate the page’s DOM, and the two don’t talk directly to each other.

You can verify this in the background console. If you add a debugger statement before the call to executeScript and issue document.getElementsByTagName("body")[0], you’ll see that the <body> is not the <body> on the web page where your code fires.

The <body> you see is in the DOM of the (generated) background.html! As such, we need to implement a message-passing based technique for communication between background.js and our new content script which will have the ability to talk to the “page DOM.”

Let’s create a “content script” called canned_response.js. We’ll move our alert() into it as a small, successful (we hope!) step forward. Doing so will “Programmatically Inject” the content script into the filtered page.

background.js

chrome.webNavigation.onCompleted.addListener(function(details) {
  chrome.tabs.executeScript(details.tabId, {
    file: "canned_response.js"
  });
}, {
  url: [{ "hostContains": "stevengharms.com" }]
});

canned_response.js

alert("razzle");

Reload the plugin, reload the page and see if you still get razzled.

I got razzled! Time for a happy dance!

Now, if we’re right, the content script should have access to the DOM. Let’s ask our alert() to spit out the contents of the first h1 with class post-title.

alert(document.querySelector("h1.post-title").innerText);

If you’re using your own or an alternate site, be sure to change your querySelector.

Once you have this working, it’s time to create a commit.

git add . ; git commit -vm'Create content script canned_response.js' -a

Before moving on to our next goal, we should talk briefly about executeScript.

Same as in the first goal, we’re only going to do the following when the page has completed loading. We then need to load the content script via a process called Programmatic Injection which is the responsibility of executeScript. This document has much of the information on how to do injection and work with the security model. It’s probably as important as the Content Script page and merits special attention.

chrome.tabs.executeScript takes 3 arguments:

  1. the ID of the tab to inject into
  2. execution parameters (what to inject: a code string or a file)
  3. and a callback that’s called when the script has been injected

Remember, we’re still writing JavaScript, so asynchrony is something you’re going to need to think about and understand while coding. If you need to work with something provided by a content script, you don’t want to call upon it until you’re sure the content script has been loaded.
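One way to make that ordering explicit is to wrap the callback-style call in a Promise, so that anything depending on the content script goes in a then(). This is only a sketch under that assumption; injectFile is a hypothetical helper of mine, not a Chrome API.

```javascript
// Hypothetical helper: wrap chrome.tabs.executeScript in a Promise.
// `api` is the global `chrome` object in a real extension.
function injectFile(api, tabId, file) {
  return new Promise((resolve, reject) => {
    api.tabs.executeScript(tabId, { file: file }, results => {
      // A failed injection is reported through runtime.lastError.
      const err = api.runtime.lastError;
      if (err) {
        reject(new Error(err.message));
      } else {
        resolve(results);
      }
    });
  });
}

// The guard assumes the "webNavigation" permission is in manifest.json.
if (typeof chrome !== "undefined" && chrome.webNavigation) {
  chrome.webNavigation.onCompleted.addListener(details => {
    injectFile(chrome, details.tabId, "canned_response.js")
      .then(() => console.log("content script ready in tab " + details.tabId))
      .catch(e => console.error(e.message));
  }, {
    url: [{ hostContains: "stevengharms.com" }]
  });
}
```

The plain callback works just as well; the Promise only makes the “don’t touch it until it’s loaded” rule harder to get wrong once more steps get chained on.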

We’ve proven that background can programmatically inject new code and that that injected code can have awareness of the page DOM. This opens the way for background.js to do data processing and then hand it to the content script (canned_response.js) to display those data on the screen. Let’s prove that.

Third Goal: Establish We Can Send Data from background.js to Content Script and Vice Versa

Because this method is conceptually tricky, let’s send a simple string from the background page to the content script and alert with it.

To do this we need to:

  1. Set up a listener in the content script; set up event processing code
  2. Send a message from the background script; set up callback processing

…In the Content Script

chrome.runtime.onMessage.addListener( (message, sender, cb) => {
  alert(message.magic_word);
  cb(document.querySelector("h1.post-title").innerText);
});

We use the chrome.runtime.onMessage.addListener API to set up a listener.

Its sole argument is a callback that is going to receive a message, a sender, and a reverse-callback function (that the message sender gets to execute). The message typically should be something like a JSON object.

Here, when we see a message we’re going to take the datum from background-land (message.magic_word) and do something DOMmish with it (“fire an alert”). Via cb we’re going to pass data back from the DOM back into background-land.
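One thing the listener above assumes is that the page really has an h1.post-title; if it doesn’t, querySelector returns null and reading innerText throws. Here is a slightly more defensive sketch. The handleMessage function and the shape of its reply object are my own choices for illustration, not what the rest of this post uses.

```javascript
// Hypothetical handler: replies with the magic word it heard plus the post
// title, or null when the page has no h1.post-title.
// `doc` is `document` in a real content script.
function handleMessage(message, sendResponse, doc) {
  const title = doc.querySelector("h1.post-title");
  sendResponse({
    heard: message.magic_word,
    title: title ? title.innerText : null
  });
}

if (typeof chrome !== "undefined" && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    handleMessage(message, sendResponse, document);
    // sendResponse was called synchronously, so nothing is returned here.
    // If the reply had to wait on something asynchronous, the listener
    // would return true to keep the response channel open.
  });
}
```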

…In the Background Script

chrome.webNavigation.onCompleted.addListener(function(details) {
  chrome.tabs.executeScript(details.tabId, {
    file: "canned_response.js"
  }, () => {
    chrome.tabs.sendMessage(details.tabId, { magic_word: "ROMY ZOMIE" }, txt => alert("called me back: " + txt));
  });
}, {
  url: [{ "hostContains": "stevengharms.com" }]
});

Remember what we said about asynchrony in the second goal: I don’t want to sendMessage until I know my content script has been loaded. That’s why the sendMessage call happens in the executeScript-finished callback.

In that callback we send the content script a magic_word, and we hand sendMessage a callback of its own, which the content script can use to return data from the DOM to the background context. We won’t use this in our app, but it’s good to see how it’s possible.

Also, remember you’re still writing JavaScript: the inner callbacks can use details.tabId because they close over the details variable that was in scope when they were defined.
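If closures are new to you, here is a tiny sketch of the same idea with Chrome taken out of the picture. The later function is a hypothetical stand-in for any asynchronous Chrome call, and onCompleted is just a plain function that borrows the event’s name.

```javascript
// Hypothetical stand-in for an asynchronous Chrome call: it runs its
// callback later, after the calling function has already returned.
function later(callback) {
  setTimeout(callback, 0);
}

// Plain function playing the role of our onCompleted listener.
function onCompleted(details, log) {
  later(() => {
    // `details` is never passed to this callback. It is captured from the
    // enclosing function's scope, which is why details.tabId still works.
    log("still know tab " + details.tabId);
  });
}

onCompleted({ tabId: 42 }, console.log);
```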

Give it a reload and see how things work.

The content script code displays a message from the background:

The content script code displays a message from the background

The callback is called and passed information from the content script:

The callback is called and passed information from the content script

Here’s our diff:

diff --git a/background.js b/background.js
index a7649a1..b42c5e2 100644
--- a/background.js
+++ b/background.js
@@ -1,7 +1,7 @@
 chrome.webNavigation.onCompleted.addListener(function(details) {
   chrome.tabs.executeScript(details.tabId, {
        file: "canned_response.js"
-  });
+  }, () => chrome.tabs.sendMessage(details.tabId, { magic_word: "ROMY ZOMIE" }, txt => alert("called me back: " + txt)) );
 }, {
   url: [{ "hostContains": "stevengharms.com" }]
 });
diff --git a/canned_response.js b/canned_response.js
index efb5511..634d501 100644
--- a/canned_response.js
+++ b/canned_response.js
@@ -1 +1,4 @@
-alert(document.querySelector("h1.post-title").innerText);
+chrome.runtime.onMessage.addListener( (message, sender, cb) => {
+  alert(message.magic_word);
+  cb(document.querySelector("h1.post-title").innerText);
+});

This looks great. It might be hard to believe, but we’ve made huge progress. Now that things can communicate our work is pretty clear:

  • Set up the menu UIs
  • Capture selection
  • Send selection to the DOM manager
  • Inject the canned response

It’s time to make a commit. With this, I think our proof-of-concept is done and we should merge this branch back into master.

git add .; git commit -am 'Basic message passing configured'; git checkout master; git merge --no-ff basic-page-action-and-content-script-poc

Next