Lately I've been working on simplifying some of the more annoying tools for mundane tasks at work, usually administrative. The most annoying to extricate myself from, by far, has been Outlook's web-mail and to a lesser extent Thunderbird. Here I detail my current solution to calendaring.
My goal in approaching this problem was quite simple, because of how limited my use was to begin with - a quick look reveals I haven't needed to set up a meeting myself in at least 6 months. With that in mind, I set out to get a read-only view of my calendar within the tools I am currently using most, namely org-mode1.
There purportedly exists a REST API for exactly this kind of thing; the problem is that I was entirely flummoxed by the requirements around authentication and authorization. The requirements for OAuth are onerous enough to seemingly preclude the kind of scripted access I'm looking to implement: I would need to create and register an application to generate an app secret, just to request an access token. All for a calendar I've published openly!
.ics File
Option two, since the calendar is available openly, is to parse the .ics file. This is actually nearly workable using ical2org.awk - the only downside is in parsing recurring meetings successfully. While that is workable in AWK, I took a look at re-implementing some of the functionality in Python using the icalendar package:
from icalendar import Calendar

with open('calendar.ics') as f:
    cal = Calendar.from_ical(f.read())

# One (summary, start, end) tuple per VEVENT component in the calendar.
events = [(component.get('summary'),
           component.get('dtstart'),
           component.get('dtend'))
          for component in cal.walk() if component.name == 'VEVENT']
Which turned out to be a breeze to use, but necessitates installing the package on my work machine in the system environment, or fiddling with packaging. I thought there must be a middle ground available through the published calendar and, eventually, found a way.
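For comparison, here is roughly what a standard-library-only read of the .ics file could look like. This is a minimal sketch, not a replacement for icalendar: `unfold` and `parse_events` are hypothetical helper names, the sample calendar is made up, and the parsing deliberately ignores time zones, nested components such as alarms, and quoted parameter values containing a colon. It also leaves RRULE as an opaque string, which is exactly the recurring-meeting problem mentioned above.

```python
def unfold(text):
    """Undo iCalendar line folding: a line that starts with a space or tab
    continues the previous line."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (' ', '\t') and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def parse_events(text):
    """Return one dict per VEVENT, keyed by bare property name."""
    events, current = [], None
    for line in unfold(text):
        if line == 'BEGIN:VEVENT':
            current = {}
        elif line == 'END:VEVENT':
            if current is not None:
                events.append(current)
            current = None
        elif current is not None and ':' in line:
            name, _, value = line.partition(':')
            # Drop parameters: "DTSTART;TZID=..." becomes "DTSTART".
            current[name.split(';')[0]] = value
    return events


# Made-up sample data, including one folded SUMMARY line.
sample = (
    "BEGIN:VCALENDAR\r\n"
    "BEGIN:VEVENT\r\n"
    "SUMMARY:Daily\r\n"
    "  standup\r\n"
    "DTSTART;TZID=Eastern Standard Time:20170227T113000\r\n"
    "DTEND;TZID=Eastern Standard Time:20170227T114500\r\n"
    "RRULE:FREQ=WEEKLY;BYDAY=MO,TU\r\n"
    "END:VEVENT\r\n"
    "END:VCALENDAR\r\n"
)
events = parse_events(sample)
```

The recurrence rule survives only as the raw string in `events[0]['RRULE']`; expanding it into individual meetings is the part that takes real work.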
Viewing the publicly available calendar through the HTML/web interface is a painful thing; it can take over 50 seconds to load a single month-view of events for one person (me). It is actually confusing how poorly the interface works, but it reaffirms my desire to distance myself from some of these irritations. Using Firefox's developer tools it's easy to see the calendar view is a thin HTML page loaded with JavaScript, used to parse well-formatted data received through an XHR. A POST is made to a URL like the following:
https://outlook.office365.com/owa/calendar/{public calendar id}/service.svc?action=FindItem&ID=-1&AC=1
With a JSON request body such as:
{
"__type": "FindItemJsonRequest:#Exchange",
"Header": {
"__type": "JsonRequestHeaders:#Exchange",
"RequestServerVersion": "Exchange2013",
"TimeZoneContext": {
"__type": "TimeZoneContext:#Exchange",
"TimeZoneDefinition": {
"__type": "TimeZoneDefinitionType:#Exchange",
"Id": "Eastern Standard Time"
}}
},
"Body": {
"__type": "FindItemRequest:#Exchange",
"ItemShape": {
"__type": "ItemResponseShape:#Exchange",
"BaseShape": "IdOnly",
"AdditionalProperties": [
{
"__type": "PropertyUri:#Exchange",
"FieldURI": "Start"
},
{
"__type": "PropertyUri:#Exchange",
"FieldURI": "End"
},
{
"__type": "PropertyUri:#Exchange",
"FieldURI": "Subject"
}]
},
"ParentFolderIds": [
{
"__type": "FolderId:#Exchange",
"Id": "...",
"ChangeKey": "..."
}],
"Traversal": "Shallow",
"Paging": {
"__type": "CalendarPageView:#Exchange",
"StartDate": "2017-02-26T00:00:00.001",
"EndDate": "2017-04-02T00:00:00.000"
...
It turns out there are several interesting things about this approach. The first thing to notice is how the AdditionalProperties array seems like a kind of GraphQL field selection that responds with only the requested data. I've omitted any information about "all-day events" or "charms" because frankly I don't care. The second thing to note is that the StartDate and EndDate are configurable to request a specific range.
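Those two knobs can be set programmatically on a saved copy of the request. The following is a minimal sketch: `exchange_timestamp` and `with_window` are hypothetical helpers, the template is trimmed to only the keys being changed (a real one would be the full captured body), and the timestamp format simply mirrors the captured request. Whether the service accepts other field names or date formats is an assumption I have not verified.

```python
import copy
from datetime import datetime, timedelta


def exchange_timestamp(moment):
    """Format like the captured request: millisecond precision, no zone suffix."""
    millis = "{:03d}".format(moment.microsecond // 1000)
    return moment.strftime("%Y-%m-%dT%H:%M:%S.") + millis


def with_window(template, start, end, fields=("Start", "End", "Subject")):
    """Return a copy of the request with the date range and field list replaced."""
    request = copy.deepcopy(template)
    body = request["Body"]
    body["ItemShape"]["AdditionalProperties"] = [
        {"__type": "PropertyUri:#Exchange", "FieldURI": name} for name in fields
    ]
    body["Paging"]["StartDate"] = exchange_timestamp(start)
    body["Paging"]["EndDate"] = exchange_timestamp(end)
    return request


# Trimmed stand-in for the captured request body.
template = {
    "Body": {
        "ItemShape": {"AdditionalProperties": []},
        "Paging": {"StartDate": None, "EndDate": None},
    }
}
monday = datetime(2017, 3, 6)
request = with_window(template, monday, monday + timedelta(days=7))
```

Working on a deep copy keeps the saved template untouched, so the same template can be reused for several date ranges.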
Armed with a unique, publicly available calendar ID, the Action:FindItem
header, and the appropriate POST body, the resulting data looks like:
{ ...
"Body": {
"ResponseMessages": {
"Items": [
{
"__type": "FindItemResponseMessage:#Exchange",
"ResponseCode": "NoError",
"ResponseClass": "Success",
"HighlightTerms": null,
"RootFolder": {
"IncludesLastItemInRange": true,
"TotalItemsInView": 44,
"Groups": null,
"Items": [
{
"__type": "CalendarItem:#Exchange",
"ItemId": {
"ChangeKey": "...",
"Id": "..."
},
"ParentFolderId": {
"Id": "...",
"ChangeKey": "..."
},
"Subject": "Standup",
"Sensitivity": "Normal",
"Start": "2017-02-27T11:30:00-05:00",
"End": "2017-02-27T11:45:00-05:00",
"IsAllDayEvent": null,
"FreeBusyType": "Busy",
"CalendarItemType": "Exception",
"Location": {
"DisplayName": "Room A",
...
}
},
...
]}}]}}}
This has a nice benefit over the ICS file: each event is "de-normalized" in the sense that there is no need to interpret the semantics of recurring events. Each item in the Items array is a complete listing of the event. Though not particularly relevant to my use case, the above response body lacks one feature compared to the ICS file, which is a summary of the event, from the invitation e-mail.
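Since every item carries its own start and end, reading the response needs no recurrence logic at all. Here is a minimal sketch, assuming Python 3.7+ for `datetime.fromisoformat` and a response shaped like the trimmed sample above. `iter_events` is a hypothetical helper, and treating a missing or null Location as an empty string is my own guess at a sensible default rather than documented behaviour.

```python
from datetime import datetime

# Trimmed stand-in for the response body shown above.
sample = {
    "Body": {"ResponseMessages": {"Items": [{
        "ResponseClass": "Success",
        "RootFolder": {"Items": [
            {"Subject": "Standup",
             "Start": "2017-02-27T11:30:00-05:00",
             "End": "2017-02-27T11:45:00-05:00",
             "Location": {"DisplayName": "Room A"}},
            {"Subject": "1:1",
             "Start": "2017-02-28T13:00:00-05:00",
             "End": "2017-02-28T13:30:00-05:00",
             "Location": None},
        ]},
    }]}}
}


def iter_events(response):
    """Yield (subject, location, start, end) with timezone-aware datetimes."""
    for message in response["Body"]["ResponseMessages"]["Items"]:
        for item in message["RootFolder"]["Items"]:
            location = (item.get("Location") or {}).get("DisplayName", "")
            yield (item["Subject"],
                   location,
                   datetime.fromisoformat(item["Start"]),
                   datetime.fromisoformat(item["End"]))


events = list(iter_events(sample))
```

Parsing into real datetimes is optional for a plain text dump, but it makes durations and sorting trivial if they are ever needed.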
Because I eschewed the icalendar package as being too much overhead to maintain, I thought it only fitting to use only the standard library to fetch and parse the calendar data.
The final step to re-implement the most pressing functionality of ical2org.awk is a simple text re-formatting into a stand-alone org-mode file. That much means a leading * for each event, with a lightly formatted timestamp. I accomplished the whole wrapper and formatter with the following script:
import re
import json
from datetime import datetime, timedelta
from urllib.error import HTTPError
from urllib.request import Request, urlopen

url = 'https://outlook.office365.com/owa/calendar/{calendar id}/service.svc?action=FindItem&ID=-1&AC=1'
headers = {'Content-Type': 'application/json',
           'Action': 'FindItem'}


def format_post_data(json_file='request.json'):
    '''Load the saved request body and set a window of a week either side of now.'''
    with open(json_file) as f:
        request_json = json.load(f)
    today = datetime.now()
    date_range = timedelta(days=7)
    fmt = lambda t: t.strftime("%Y-%m-%dT%H:%M:%S.%fZ")
    request_json['Body']['Paging']['StartDate'] = fmt(today - date_range)
    request_json['Body']['Paging']['EndDate'] = fmt(today + date_range)
    return json.dumps(request_json)


def fetch_calendar(url, headers, request_data):
    '''
    request_data - bytes
    headers - dict
    '''
    try:
        req = Request(url, data=request_data, headers=headers)
        response = urlopen(req).read()
    except HTTPError as e:
        # gross, but a quick way to debug API errors, dump (potentially HTML)
        # responses into the traceback
        e.msg += e.read().decode()
        raise
    raw_data = json.loads(response)
    # Lazily yield (subject, location, start, end) for each calendar item.
    events = ((item['Subject'], item['Location']['DisplayName'],
               item['Start'], item['End'])
              for item in
              raw_data['Body']['ResponseMessages']['Items'][0]['RootFolder']['Items'])
    return events


def format_entry(subject, location, start, end):
    # Strip the seconds and UTC offset, then swap the ISO 'T' for a space,
    # e.g. '2017-02-27T11:30:00-05:00' becomes '2017-02-27 11:30'.
    tidyup = lambda ts: (re.sub(r'(:\d{2}[+-]\d{2}:\d{2}$)', '', ts)
                         .replace('T', ' '))
    start = tidyup(start)
    end = tidyup(end)
    return ('* {}\n'
            ' <{}>--<{}>\n'
            ' {}\n').format(subject, start, end, location)


def format_agenda(events):
    return '\n'.join(
        format_entry(subject, location, start, end)
        for (subject, location, start, end) in events)


request_data = format_post_data()
events = fetch_calendar(url, headers, request_data.encode())
print(format_agenda(events))
The end result is almost as uninteresting as you might imagine. I write it out to an agenda file that I don't typically access directly, instead invoking org-mode's "Week-agenda" to keep track of meetings to which I've been invited. This lets me track my own appointments locally, as I already do, and removes the need for Thunderbird or Outlook to track the rest.
Week-agenda (W10):
Monday 6 March 2017 W10
Tuesday 7 March 2017
agenda: 11:30...... Daily standup
Wednesday 8 March 2017
agenda: 11:30...... Daily standup
agenda: 13:00...... 1:1
Thursday 9 March 2017
agenda: 11:00...... Open Enrollment Meeting
agenda: 11:30...... Daily standup
agenda: 12:00...... Tech Talk
agenda: 14:00...... Open Enrollment Meeting
Friday 10 March 2017
Saturday 11 March 2017
Sunday 12 March 2017
Of course, there are downsides to relying on an internal API such as this - as much was apparent the first day I tried actually using the above script. It seems the API is comically unreliable, throwing 500s on the majority of the requests made. I've no idea how Outlook manages to function when things are so unreliable, but I've got to wonder if it's not what contributes to the glacial load times. It looks like now might be as good a time as any to look into alternative approaches2.
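One stopgap for the flaky responses, short of switching approaches, would be to retry failed requests with a growing delay. This is a minimal sketch rather than something the script above does: `with_retries` is a hypothetical helper, the attempt count and delays are arbitrary, and the `flaky` function only simulates a server that fails twice before answering.

```python
import time
from urllib.error import HTTPError


def with_retries(call, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call(), retrying on HTTP 5xx with exponentially growing pauses."""
    for attempt in range(attempts):
        try:
            return call()
        except HTTPError as error:
            # Client errors will not fix themselves; neither will a final failure.
            if error.code < 500 or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Simulated endpoint: two 500s, then a successful (empty JSON) response.
calls = []

def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise HTTPError("https://example.invalid", 500,
                        "Internal Server Error", None, None)
    return b"{}"


delays = []  # record the pauses instead of actually sleeping
result = with_retries(flaky, sleep=delays.append)
```

With the script above, the real call would be wrapped as something like `with_retries(lambda: urlopen(req).read())`. Retrying only papers over the unreliability, of course; it does nothing about the load times.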