Monday, May 26, 2008

Predicting Whether You Need to Reply to an Email

Every couple of months, I try to catch up on email research from academia. This weekend, I found this paper [1] by Dredze et al. from UPenn.

The problem they want to solve is predicting whether an email in your inbox needs a reply. This is very relevant research, as it would allow users to quickly scan their email for items they actually need to act on. The result could just be a simple "needs reply" indicator in the list of incoming items.


How do they do this? The authors used a number of attributes to classify emails, such as whether the user frequently replies to the author, whether the email contains any question marks, and combinations of words appearing in the text.
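
To make those attributes concrete, here is a minimal sketch of what such a feature extractor might look like. It is not the authors' code: the `Email` class, the `reply_features` function, the two count dictionaries, and the use of word pairs (bigrams) as the "combinations of words" are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Email:
    """Hypothetical minimal email record, for illustration only."""
    sender: str
    body: str


def reply_features(email, reply_counts, received_counts):
    """Return illustrative reply-prediction features for one email.

    reply_counts / received_counts are assumed dicts mapping a sender
    address to how many of their emails the user replied to / received.
    """
    received = received_counts.get(email.sender, 0)
    replied = reply_counts.get(email.sender, 0)
    # How often the user has replied to this sender in the past.
    reply_rate = replied / received if received else 0.0

    # Lowercased words, then adjacent word pairs as a simple stand-in
    # for "combinations of words appearing in the text".
    words = re.findall(r"[a-z']+", email.body.lower())
    bigrams = {f"{a} {b}" for a, b in zip(words, words[1:])}

    return {
        "sender_reply_rate": reply_rate,
        "has_question_mark": "?" in email.body,
        "bigrams": bigrams,
    }
```

A classifier would then be trained on these features for emails with known "replied / did not reply" labels; which learning algorithm to use is a separate choice that this sketch does not cover.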

The results aren't quite there yet. For the best test corpus, they achieve 77% recall, which means that they find 77% of the emails that need a reply. However, they come in at 76% precision, which means that 24% of the emails they mark as "needs reply" don't actually need one.
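
For readers who want to see how the two numbers relate, here is a small sketch of the arithmetic. The counts in the usage note are made up to roughly reproduce the percentages above; they are not taken from the paper, and the function name `precision_recall` is my own.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Compute (precision, recall) from raw counts.

    true_positives:  emails flagged "needs reply" that really need one
    false_positives: emails flagged "needs reply" that do not need one
    false_negatives: emails that need a reply but were not flagged
    """
    flagged = true_positives + false_positives
    needing_reply = true_positives + false_negatives
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / needing_reply if needing_reply else 0.0
    return precision, recall
```

As a hypothetical example, suppose 100 emails need a reply and the classifier flags 101 emails, 77 of them correctly. Then `precision_recall(77, 24, 23)` gives a recall of 77/100 and a precision of 77/101, which is about 76%.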

Thus, reply prediction remains exciting. I'm hoping that they come up with a better classifier, and that someone then turns this into an industrial-grade email application.

[1] Mark Dredze, Tova Brooks, Josh Carroll, Joshua Magarick, John Blitzer, Fernando Pereira: Intelligent Email: Reply and Attachment Prediction, Intelligent User Interfaces 2008, Spain. [PDF]

6 comments:

Mark Dredze said...

Thanks for the mention.

I don't recall if we put this in the paper or not, but the results we got were close to how well people did on this task (labeling each other's email). Looking at many of the misclassifications, it's hard to see getting much better.

However, the accuracy this problem requires depends on the UI. If you just want to show whether an email needs a reply in the UI (I built such an extension for Thunderbird), then this may not be good enough. However, if you used it in some other way, perhaps as part of prioritization or as a less confident indicator, it may be useful. We had some ideas about managing expected replies: emails for which I am waiting for a reply. In that system you may not need high precision but higher accuracy.

john said...

Really nice, thanks for pointing this research out!

regards

John Jones
http://www.johnjones.me.uk

Gabor said...

Mark -

Thanks for the reply - good points. If the results are as good as humans can classify email, it almost seems like the current results are an upper limit of what is possible.

Here at Xobni, Greg Duffy experimented with some SVM-based classifiers for reply prediction, and we got results similar to those in your paper. Thus, a simple switch of algorithms doesn't help.

Let's chat sometime about different UI solutions for presenting the reply prediction labels.

Gabor

Kaitlin Duck Sherwood said...

I advocate sorting the messages by what group the sender was in (like BiFrost, but hopefully with more than five groups!), then colouring the message based on how you were addressed -- to you and only you, cc you, bcc, etc. This very quickly gives you a 2D view (where one of the axes is color) of who it came from and who it went to. The location in that 2-space is a really good indicator of whether you need to respond or not, it is cheap to calculate both dimensions, and the wetware is good at making the final decision.

I talk about how to group-by-sender-group in my books
http://emailoverload.com
but at the time, didn't think that colour-coding was such a big deal. Now I think it is a bigger deal, see
http://www.emailoverload.com/outlook/ColorCoding.php
for how to do that in Outlook.

(It's also MUCH easier to set up colour-code than to group-by-sender!)

Research Papers said...

Many institutions limit access to their online information. Making this information available will be an asset to all.