Lisp, the Universe and Everything

2009-04-11

42

Speaking of the ultimate question... This number is probably second only to "Hello world" and foo-bar-baz in how often it shows up in programming tutorials. Some time ago I translated into Russian an excerpt from a book that tells its history. Here it is.
In general, Douglas Adams is probably worth reading for everyone: there is so much in his books that is topical and, at the same time, timeless...

2009-04-05

Working around the Transfer-Encoding: chunked problem in nginx

While Igor Sysoev is thinking about how to properly (i.e. robustly and efficiently) implement handling of Transfer-Encoding: chunked requests that arrive without a Content-Length in nginx, I needed to hack together a quick workaround. The patch drops the check that rejects such requests with 411 (NGX_HTTP_LENGTH_REQUIRED) and lets the body reader proceed when no Content-Length is known. I decided to post it here for those who need it "here and now":


--- old/ngx_http_request.c 2009-04-05 20:41:18.000000000 +0300
+++ new/ngx_http_request.c 2009-04-05 20:38:33.000000000 +0300
@@ -1403,16 +1403,6 @@
         return NGX_ERROR;
     }
 
-    if (r->headers_in.transfer_encoding
-        && ngx_strcasestrn(r->headers_in.transfer_encoding->value.data,
-                           "chunked", 7 - 1))
-    {
-        ngx_log_error(NGX_LOG_INFO, r->connection->log, 0,
-                      "client sent \"Transfer-Encoding: chunked\" header");
-        ngx_http_finalize_request(r, NGX_HTTP_LENGTH_REQUIRED);
-        return NGX_ERROR;
-    }
-
     if (r->headers_in.connection_type == NGX_HTTP_CONNECTION_KEEP_ALIVE) {
         if (r->headers_in.keep_alive) {
             r->headers_in.keep_alive_n =


--- old/ngx_http_request_body.c 2009-04-05 20:41:39.000000000 +0300
+++ new/ngx_http_request_body.c 2009-04-05 20:34:45.000000000 +0300
@@ -54,11 +54,6 @@
 
     r->request_body = rb;
 
-    if (r->headers_in.content_length_n < 0) {
-        post_handler(r);
-        return NGX_OK;
-    }
-
     clcf = ngx_http_get_module_loc_conf(r, ngx_http_core_module);
 
     if (r->headers_in.content_length_n == 0) {
@@ -180,7 +175,8 @@
 
     } else {
         b = NULL;
-        rb->rest = r->headers_in.content_length_n;
+        rb->rest = r->headers_in.content_length_n >= 0 ? r->headers_in.content_length_n
+                                                       : NGX_MAX_SIZE_T_VALUE;
         next = &rb->bufs;
     }
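
For readers unfamiliar with the wire format: a chunked body carries its length piecewise, as a hexadecimal size line followed by that many octets of data, repeated until a zero-sized chunk. That is why such a request has no Content-Length. Below is a minimal sketch of a decoder in Common Lisp. DECODE-CHUNKED is a hypothetical helper written for illustration only; it is not part of nginx or Hunchentoot, it assumes a well-formed ASCII body (so that characters and octets coincide), and it ignores chunk extensions and trailers.

```lisp
(defun decode-chunked (body)
  "Decode BODY, a string holding a chunked HTTP message body, into its payload.
Illustrative sketch: assumes well-formed ASCII input and ignores trailers."
  (let ((crlf (coerce '(#\Return #\Newline) 'string))
        (pos 0))
    (with-output-to-string (out)
      (loop
        (let* ((eol (search crlf body :start2 pos))
               ;; the size line is hexadecimal, optionally followed by ";extension"
               (semi (position #\; body :start pos :end eol))
               (size (parse-integer body :start pos :end (or semi eol)
                                         :radix 16)))
          ;; a zero-sized chunk terminates the body
          (when (zerop size)
            (return))
          (let ((start (+ eol 2)))
            (write-string body out :start start :end (+ start size))
            ;; skip the chunk data and the CRLF that follows it
            (setf pos (+ start size 2))))))))
```

A server that wants to accept such requests has to run this kind of loop incrementally over the incoming buffers, which is the part that is hard to do robustly inside an event-driven core.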

In the meantime I have figured out some of the server's internals. What can I say? First of all, event-driven architecture is complicated, especially when there is no high-level description of who calls what, and especially when the code has virtually no comments.

And about C and Lisp (based on examining the source of two HTTP servers: nginx and Hunchentoot). Lisp is clay: shape whatever you want, with all the attendant advantages and shortcomings. C... There is a children's construction set made of metal bars and corner pieces. It's fun to play with. And, generally, everything is quite clear, and it's possible to assemble good, solid constructions. But making a vase with rounded corners out of it, that is for the geeks...

2009-03-23

CL-WHO macros

CL-WHO is a library that generates HTML from s-expressions (which I'll call pseudo-html forms). For now it has one major shortcoming, which stems from the way the WITH-HTML-OUTPUT macro processes its input: it doesn't support macros that expand to pseudo-html forms. (The reason is that the input is first subjected to rule-based transformations, which allow pseudo-html and Lisp forms to be intermixed.)

Let's look at a simple example...

First we just process plain pseudo-html:

CL-USER> (with-html-output-to-string (s)
           (:p "a"))
"<p>a</p>"

Then we mix it with Lisp code, using the special symbol HTM to tell WITH-HTML-OUTPUT where to switch back to pseudo-html:

CL-USER> (with-html-output-to-string (s)
           (:ul (dolist (a '(1 2 3))
                  (htm (:li (str a))))))
"<ul><li>1</li><li>2</li><li>3</li></ul>"

Now we try to define a macro:

CL-USER> (defmacro test (a)
           `(htm (:li (str ,a))))
TEST
CL-USER> (with-html-output-to-string (s)
           (:ul (dolist (a '(1 2 3))
                  (test a))))

But this code signals an UNDEFINED-FUNCTION error ("The function :LI is undefined."). Why? Because the translation of pseudo-html s-expressions is already finished by the time TEST is macroexpanded, so it is expanded not inside (:ul (dolist (a '(1 2 3)) ...)), but inside calls to WRITE-STRING.

It's definitely unlispy not to be able to use macros anywhere you want :), so ever since I started using CL-WHO I have been trying to find a workaround. (And I guess many others did the same.) At least on the mailing list I turned out not to be the only one who proposed something...

Below are the 3 solutions that I've tried, in chronological order. Each implements a distinct approach to combining pseudo-html forms with macros.

Delimited macroexpansion: introduce an additional special symbol, EMB (along the lines of the already present HTM, STR, ESC and FMT), which instructs CL-WHO to MACROEXPAND-1 the enclosed form during input transformation.

Here is an example (based on the one from the CL-WHO manual):

(defmacro embed-td (j)
  `(:td :bgcolor (if (oddp ,j) "pink" "green")
        (fmt "~@R" (1+ ,j))))

(with-html-output-to-string (*http-stream*)
 (:table :border 0 :cellpadding 4
         (loop for i below 25 by 5
               do (htm
                   (:tr :align "right"
                        (loop for j from i below (+ i 5)
                              do (htm (emb (embed-td j)))))))))

I implemented it by tweaking TREE-TO-TEMPLATE, the input-transforming function underlying WITH-HTML-OUTPUT:

(defun tree-to-template (tree)
  "Transforms an HTML tree into an intermediate format - mainly a
flattened list of strings. Utility function used by TREE-TO-COMMANDS-AUX."
  (loop for element in tree
        nconc (cond ((or (keywordp element)
                         (and (listp element)
                              (keywordp (first element)))
                         (and (listp element)
                              (listp (first element))
                              (keywordp (first (first element)))))
                      ;; normal tag
                      (process-tag element #'tree-to-template))
                     ((listp element)
                      ;; most likely a normal Lisp form (check if we
                      ;; have nested HTM subtrees) or an EMB form
                      (list
                       (if (eql (car element) 'emb)
                           (replace-htm (list 'htm (macroexpand-1 (cadr element)))
                                        #'tree-to-template)
                           (replace-htm element #'tree-to-template))))
                     (t
                      (if *indent*
                        (list +newline+ (n-spaces *indent*) element)
                        (list element))))))

This is a simple and undertested solution, so it may well fail in some corner cases. At least, I couldn't make it work reliably when I tried to combine it with PARENSCRIPT code (see http://common-lisp.net/pipermail/cl-who-devel/2008-July/000351.html). So it's just a direction (though quite a feasible one). But I decided to leave it aside and try the second approach...

Preliminary macroexpansion: don't modify CL-WHO, but build a macro infrastructure on top of it that expands the macros before the whole form is passed to WITH-HTML-OUTPUT.

After thinking about the problem for some time I came to the conclusion that it would be better to treat CL-WHO as just a good processor for static pseudo-html data, and to build macros on top of it that add the macroexpansion ability. For that, consider the following model: if we want to use macros, we define an "endpoint" at which they should be expanded (the function that ultimately generates the HTML). I called this endpoint WHO-PAGE. Its definition works like DEFMACRO: it takes a backquote template, in which we can explicitly control the evaluation of forms.
```
(defmacro def-who-page (name (&optional stream) pseudo-html-form)
 "Creates a function to generate an HTML page with the use of
WITH-HTML-OUTPUT macro, in which pseudo-html forms, constructed with
DEF-WHO-MACRO, can be embedded. If STREAM is nil/not supplied
WITH-HTML-OUTPUT-TO-STRING to a throwaway string will be used,
otherwise -- WITH-HTML-OUTPUT"
 `(macrolet ((who-tmp ()
               `(,@',(if stream
                         `(with-html-output (,stream nil :prologue t))
                         `(with-html-output-to-string
                            (,(gensym) nil :prologue t)))
                     ,,pseudo-html-form)))
    (defun ,name ()
      (who-tmp))))
```
The definition of such an HTML-generating function uses MACROLET to expand the macros before passing the body to CL-WHO. And the special macros that will be used inside the template can be defined like this:
```
(defmacro def-who-macro (name (&rest args) pseudo-html-form)
  "A macro for use with CL-WHO's WITH-HTML-OUTPUT."
  `(defmacro ,name (,@args)
     `'(htm ,,pseudo-html-form)))
```
But that's not all. Sometimes we need to evaluate the arguments passed to a who-macro (to be able to accept pseudo-html forms as arguments), so a complementary utility can be added (it's named following the PARENSCRIPT naming convention):
```
(defmacro def-who-macro* (name (&rest args) pseudo-html-form)
  "Who-macro, which evaluates its arguments (like an ordinary function,
which it is, in fact).
Useful for defining syntactic constructs."
  `(defun ,name (,@args)
     ,pseudo-html-form))
```
Example usage:
```
(def-who-macro title (for &optional (show-howto? t))
  "A title with a possibility to show howto-link by the side"
  `(:div :class "title"
         (:h2 :class "inline"
              "→ " ,(ie for)) " "
         (when ,show-howto?
           (how-to-use ,for)) (:br)))

(def-who-macro* user-page (title &key head-form onload body-form)
  `(:html
    (:head (:title ,(ie title))
           (:link :rel "stylesheet" :type "text/css"
                  :href "/css/user.css")
           (:script :src "/js/user.js" :type "text/javascript"
                    "")
           ,head-form)
    (:body :onload ,onload
           ,(header-box)
           (:table ,@(wide-class)
                   ,body-form)
           ,(footer-box))))
```
I could successfully use this approach and have added some features to it (docstrings and once-only evaluation, for example). Actually, I built a whole dynamic web site on it, and then ran into a problem. As the code of the system grew quite complex, with several layers of who-macros, SBCL started to run out of resources when compiling it. More specifically, upon recompilation of the system it sometimes died with a heap dump. We tried to investigate the problem, but couldn't dig deep enough to find its cause (or just lacked the knowledge to debug it properly).

But the very possibility of this solution shows the nature of Lisp: virtually everything can be changed, and all the boilerplate for that can be made transparent to the end user. Alas, it also shows that this flexibility comes at a cost: it's pretty hard to build a robust and simple system that handles all the possible corner cases.

Transparent macroexpansion (idea and basic implementation by Victor Kryukov): the last approach (which I use now) was proposed on the CL-WHO mailing list. It is, in some sense, the most obvious one: modify the CL-WHO rules to accommodate macroexpansion.

Here's the implementation of this idea that I use, which is closely based on the one proposed by Victor:

(defparameter *who-macros* (make-array 0 :adjustable t :fill-pointer t)
  "Vector of macros, that MACROEXPAND-TREE should expand.")

(defun macroexpand-tree (tree)
  "Recursively expand all macros from the *WHO-MACROS* vector in TREE."
  (apply-to-tree
   (lambda (subtree)
     (macroexpand-tree (macroexpand-1 subtree)))
   (lambda (subtree)
     (and (consp subtree)
          (find (first subtree) *who-macros*)))
   tree))

(defmacro def-internal-macro (name attrs &body body)
  "Define internal macro, that will be added to *WHO-MACROS*
and macroexpanded during W-H-O processing.
Other macros can be defined with DEF-INTERNAL-MACRO, but a better way
would be to add additional wrapper, that will handle such issues, as
multiple evaluation of macro arguments (frequently encountered) etc."
  `(eval-when (:compile-toplevel :load-toplevel :execute)
     (prog1 (defmacro ,name ,attrs
              ,@body)
       (unless (find ',name *who-macros*)
         ;; the earlier the macro is defined, the faster it will be found
         ;; (optimized for frequently used macros, like the internal ones,
         ;; defined first)
         (vector-push-extend ',name *who-macros*)))))

;; basic who-macros

(def-internal-macro htm (&rest rest)
  "Defines macroexpansion for HTM special form."
  (tree-to-commands rest '*who-stream*))

(def-internal-macro str (form &rest rest)
  "Defines macroexpansion for STR special form."  
  (declare (ignore rest))
  (let ((result (gensym)))
    `(let ((,result ,form))
       (when ,result (princ ,result *who-stream*)))))

(def-internal-macro esc (form &rest rest)
  "Defines macroexpansion for ESC special form."
  (declare (ignore rest))
  (let ((result (gensym)))
    `(let ((,result ,form))
       ;; test the already evaluated value, so that FORM is evaluated only once
       (when ,result (write-string (escape-string ,result)
                                   *who-stream*)))))

(def-internal-macro fmt (form &rest rest)
  "Defines macroexpansion for FMT special form."
  `(format *who-stream* ,form ,@rest))
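
The listing above calls APPLY-TO-TREE, which is not shown in this post. A minimal sketch of what such a helper could look like follows, assuming it takes a function, a test and a tree, and replaces every subtree that satisfies the test with the result of the function. EXPAND-KNOWN and LI-ITEM are hypothetical names introduced only to make the example self-contained; they mirror MACROEXPAND-TREE and a user-defined who-macro. The sketch also assumes that the tree consists of proper lists.

```lisp
(defun apply-to-tree (function test tree)
  "Return a copy of TREE where each subtree satisfying TEST is replaced
by the result of calling FUNCTION on it. Matching subtrees are not
descended into: FUNCTION is expected to recurse itself if needed.
Sketch only: assumes that TREE is made of proper lists."
  (cond ((funcall test tree)
         (funcall function tree))
        ((consp tree)
         (mapcar (lambda (subtree)
                   (apply-to-tree function test subtree))
                 tree))
        (t tree)))

;; A toy macro standing in for a user-defined who-macro.
(defmacro li-item (x)
  `(:li (str ,x)))

(defun expand-known (tree names)
  "Expand, inside TREE, the calls to the macros listed in NAMES
(a stand-in for MACROEXPAND-TREE with an explicit list of names)."
  (apply-to-tree
   (lambda (subtree)
     (expand-known (macroexpand-1 subtree) names))
   (lambda (subtree)
     (and (consp subtree)
          (member (first subtree) names)))
   tree))

;; (expand-known '(:ul (dolist (a items) (li-item a))) '(li-item))
;; => (:UL (DOLIST (A ITEMS) (:LI (STR A))))
```

Only the forms whose head is a registered name are expanded, and this happens before the pseudo-html rules are applied, so the result of the expansion is still seen as pseudo-html.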

An interesting side effect here is that HTM (et al.) now becomes not just a symbol but a normal macro, so this works out of the box:

CL-USER> (with-html-output-to-string (s)
           (:ul (dolist (a '(1 2 3))
                  (test a))))

And a user-level DEF-WHO-MACRO that incorporates ONCE-ONLY (which is regularly needed in this domain) and docstring support can be defined like this:

(defmacro def-who-macro (name attrs &body body)
  "Define a macro, that will be expanded by W-H-O.
(Its name is added to *WHO-MACROS*).
Body is expected to consist of an optional docstring and declaration,
followed by a single backquoted WHO template.
All regular, optional and keyword arguments are wrapped in ONCE-ONLY
and can't contain pseudo-html forms."
  `(eval-when (:compile-toplevel :load-toplevel :execute)
     (unless (find ',name *who-macros*)
       (vector-push-extend ',name *who-macros*))
     (defmacro ,name ,attrs
       ,@(when (cdr body) ; docstring
           (butlast body))
       ,(let ((attrs (if-it (or (position '&rest attrs) (position '&body attrs))
                            (subseq attrs 0 it)
                            attrs)))
             (if attrs
                 `(once-only ,(mapcar #`(car (as 'list _))
                                      (remove-if #`(char= #\& (elt (format nil "~a"
                                                                           (car (mklist _)))
                                                                   0))
                                                 attrs))
                    `(htm ,,(last1 body)))
                 (last1 body))))))

I hope the above explanations and examples are easy to grasp and, moreover, that they will be useful for those who are looking for a solution to the same problem. Still, if questions arise, feel free to ask them.
If someone uses a different approach, I'd be very interested to hear about it.

PS. By the way, Edi Weitz mentioned on the mailing list that he is preparing a new version of the library. Since he said that adding macroexpansion was on his TODO list, it will be interesting to see what the new CL-WHO will bring...

2009-03-10

On self-signed certificates

Written for: habrahabr.ru

Because of my involvement in the fin-ack.com project I constantly run into remarks like this:

I don't trust your self-signed certificate; why don't you buy a "normal" one?

To my mind, this is one of the cases of misunderstanding and prejudice that are so common when it comes to security on the Internet. (Like the famous "Hackers, crackers, spam, cookies" :). I want to examine it from two points of view: as a person who worked for some time in information security at a bank and dealt with most aspects of information security, and as a person engaged in developing and growing an Internet service.

But first I'll answer the question: why don't we have a "normal" certificate? (Actually, as of recently, we do :) The main reason is that this item was in N-th place on our list of priorities, and only now have all the N-1 preceding items been completed. When you work on a new project, you always have to give something up, because resources, above all time, are limited...

And why was it as far down as N-th place?
First of all, why is an SSL certificate needed at all? To encrypt the HTTP connection between the browser and the site, over which the password and other confidential data will be transmitted. What changes if the certificate is not signed by a trusted certificate authority? Nothing! The connection will still be encrypted in exactly the same way. The only possible problem is a man-in-the-middle attack, which on the Internet usually takes the form of phishing or pharming.

In phishing, the user is redirected to a site with a similar URL. In this case the browser will necessarily show a certificate warning (the same warning also appears on the first visit to the real site with a self-signed certificate).

In this situation it is, on the whole, enough to look at which domain the certificate belongs to and, if it is exactly the domain you wanted to reach, add the certificate to the trusted ones. After that, any message about an untrusted certificate for this site can be treated as an alarm bell.

Pharming differs in that the user ends up, seemingly, on the very site he wanted (judging by the URL). Still, just as with phishing, he will be shown a message about an untrusted certificate.

But many people read more into an SSL certificate:

...If the certificate is issued by some Verisign (for example), then it is a kind of "guarantee" that there is a real organization or individual behind the site and that, at the very least, "there is someone to hold responsible if anything happens". That is, it's basically a guarantee of the "seriousness" of the site owners' intentions.

We understand perfectly well that such an opinion has a right to exist. But it's not that simple. Nothing prevents one from buying a certificate from Verisign or another vendor for a bogus company or under false personal data. They cannot verify that the client has legal grounds to present itself as some "Horns and Hoofs" LLC from Perm, Russian Federation. The only thing that is checked when a certificate is issued is whether the domain you request it for belongs to you.
So, to my mind, buying a certificate from Verisign is merely a demonstration that the company is ready to throw away $500 and several man-hours, if not more, on settling all the organizational matters, instead of spending that time and money on developing new features or really improving the security of the system. In general, Verisign is for banks. There are other vendors that are simpler and cheaper to deal with (an example is below).

But the main point is a different one. Any computer system is as vulnerable as its least protected link. A good approach to security is always a set of measures in which all the risks are taken into account and each is given due attention. I'll try to list the main risks to the security of user data for a typical Internet project dealing with personal information (webmail, personal accounting, etc.) in order of importance:

An insufficiently thought-out system of access to confidential data, which has holes

Problems in the software used by the system (OS, web server, implementation of encryption protocols) that make a break-in possible

Man-in-the-middle attacks, social engineering

Phishing/pharming (man-in-the-middle) is, in my opinion, one of the least important risks, since it is much harder to carry out and gets shut down quickly; therefore such an attack makes sense only for systems with a very large number of users, from whom very valuable data can be extracted quickly (the classic example: Internet banking). Compared to that, it is much easier to run a vulnerability scanner and discover that the system uses an old version of OpenSSH or that some patch is not installed on Windows (thousands of such testers knock on our door every day :). Or to discover some XSS or SQL injection vulnerability. And that is not to mention the more complex problems of building secure Internet systems, such as the correct use of sessions (and cookies) for authentication. This is what needs attention first of all!

One more security aspect related to certificates. Whether it is self-signed or issued by Verisign, there is still a private key associated with it, which has to be stored somewhere. Moreover, it is constantly used by the web server when opening HTTPS connections, i.e. it cannot be used once at power-on, saved to a flash drive and locked in a safe. What happens if someone gets hold of the key (a programmer who had access to the server, an intruder, or someone else)? Ideally, the key is encrypted, but with the will and the resources it can be decrypted (and nowadays that is cheaper than organizing a phishing attack). And we haven't yet taken into account that some web servers or reverse proxies cannot work with encrypted keys at all. And the password may also be hardcoded somewhere in the text of the program or of the script that launches it... So the fact that some site sports a badge saying that its SSL certificate is signed by Verisign gives no guarantee that one fine day a pharming clone will not appear, using the same certificate with the stolen private key.

And here I'm not even touching on such aspects of PKI as the peculiarities of its support on various specific platforms, such as j2me...

Summary: there are things that are, on the whole, right, but not always worth the effort spent. Startups should concentrate on other things, and details like "proper" certificates should come in the second echelon. First, as the Americans say, you need to "get the product right". Everything in its own time.

P.S. Actually, I understand that it is easier to adapt to public opinion than to try to change it, so we already have a "proper" certificate (the time has come). One that, by the way, costs 10 times less than most (thanks, GoDaddy!). The purpose of this article is, first of all, to touch once more on the inexhaustible topic of information security on the Internet and to try to put the emphasis right in one of its aspects.

2009-02-26

The "System Programming and Operating Systems" course

I'm currently teaching this course at KPI and wanted to find volunteers (gurus in this area, so to speak) to give a couple of guest lectures. If I put myself in the place of a company head who is constantly looking for employees, or even of a project manager who also has to take part in hiring for his project, this would interest me even from a purely utilitarian point of view. Not to mention the public good :)

I tried asking on the devua forum: http://www.developers.org.ua/forum/topic/384. No results so far. Maybe I'm asking in the wrong place, or is it simply not the done thing here?

2009-02-17

Sending SMTP mail with UTF-8 characters from Common Lisp

Today I explored the topic of sending a base64-encoded SMTP message, and it turned out to be rather tricky. I discovered that for this task (if you don't rely on Franz's infrastructure) effectively 4 libraries have to be used. And as they are very scarcely documented, I decided to write this short description.

CL-SMTP

Initially I knew very little about the SMTP protocol (except for HELO and EHLO). So I started with the plain CL-SMTP functionality of SEND-EMAIL [1]. The function is not documented and has no errors of its own; it just re-signals the errors of the underlying USOCKET library. That's why it took some effort on my part to understand why the following code signals USOCKET:UNKNOWN-ERROR [2]:

(cl-smtp:send-email "localhost" "[email protected]" "[email protected]"
                    "subject"
                    "тест")
Explanation: this sends mail through the SMTP server on localhost from [email protected] to [email protected] with the body "тест".

It turned out that the SMTP server just didn't accept the non-ASCII characters in the body, because the default encoding is 7bit.
In the process I discovered a useful debugging feature of CL-SMTP: (setf cl-smtp::*debug* t) [3]. It prints the log of the SMTP interaction.

CL-MIME

So I asked Google and found this article by Hans Hübner, where he explains his enhancements to CL-SMTP (currently integrated into the codebase) and describes how to send attachments with it. But to properly apply his examples to my case I first had to learn a couple of things about MIME. In the example Hans uses the multipart/mixed Content-Type to send a message with attachments. That is not necessary for the simple task of sending a text message in UTF-8: for that you can use the text/plain Content-Type with the UTF-8 charset. But for non-ASCII characters to be accepted by the mail server they should (usually) be encoded in base64, which is declared in the Content-Transfer-Encoding header. All these activities are handled by the CL-MIME library. The library is quite self-explanatory, so the lack of documentation doesn't hurt, except in a couple of places.
First of all, properly formatted MIME text data is produced by the function PRINT-MIME [4], which takes a CLOS MIME object with the appropriate fields set. The problem is that the generated data contains both the MIME headers and the part that should go into the message body. So the function's output can't be used as an argument to SEND-EMAIL, because the headers would go into the data section and the mail client would not take them into account (so the body would be shown undecoded). For this case (and other cases when you need more control over the SMTP interaction) Hans has created a high-level macro, WITH-SMTP-MAIL [5]. There's a little catch in it as well: unlike SEND-EMAIL, it accepts a list of recipients (while the former takes a single recipient string).

CL-BASE64 & ARNESI

The second thing, which actually caused me the most trouble, was the tricky and once again undocumented handling of the :CONTENT initarg of MIME objects [6]. When you provide the :ENCODING initarg (primarily, :BASE64), the content part of the data emitted by PRINT-MIME is subjected to the appropriate encoding (performed by CL-BASE64). The interesting thing is that it produces wrong output for UTF-8 strings. The proper argument format is an octet array, so you need a function to reduce the string to this format.

ARNESI is a useful library that provides a lot of small utilities from different areas. So I was glad to find out that the needed function, STRING-TO-OCTETS [7], is provided by it, because the library was already used in my project.

It's worth mentioning that non-ASCII characters used inside a MIME body can also be sent as is. But, AFAIU, that is not as robust as the base64-encoded form.
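
To make "reducing the string to octets" concrete, here is a library-free sketch of a UTF-8 encoder. UTF8-OCTETS is a hypothetical name used only for this illustration (in the real code ARNESI:STRING-TO-OCTETS does the job), and the sketch assumes that CHAR-CODE returns Unicode code points, which holds for SBCL and most current implementations but is not guaranteed by the standard.

```lisp
(defun utf8-octets (string)
  "Encode STRING as a vector of UTF-8 octets.
Sketch only: assumes that CHAR-CODE returns Unicode code points."
  (let ((out (make-array 0 :element-type '(unsigned-byte 8)
                           :adjustable t :fill-pointer t)))
    (flet ((emit (octet)
             (vector-push-extend octet out))
           (continuation (code shift)
             ;; a continuation octet: the bits 10 followed by 6 bits of payload
             (logior #x80 (logand (ash code (- shift)) #x3F))))
      (loop for char across string
            for code = (char-code char)
            do (cond ((< code #x80)       ; 1 octet: plain ASCII
                      (emit code))
                     ((< code #x800)      ; 2 octets
                      (emit (logior #xC0 (ash code -6)))
                      (emit (continuation code 0)))
                     ((< code #x10000)    ; 3 octets
                      (emit (logior #xE0 (ash code -12)))
                      (emit (continuation code 6))
                      (emit (continuation code 0)))
                     (t                   ; 4 octets
                      (emit (logior #xF0 (ash code -18)))
                      (emit (continuation code 12))
                      (emit (continuation code 6))
                      (emit (continuation code 0))))))
    out))
```

For example, the four characters of "тест" (U+0442, U+0435, U+0441, U+0442) come out as the eight octets D1 82 D0 B5 D1 81 D1 82. It is these octets, not the four characters, that have to be base64-encoded for a mail client to decode the body correctly as UTF-8.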

Result

So the final code turned out to be like this:

(defun send-email (text &rest recipients)
  "Generic send SMTP mail with some TEXT to RECIPIENTS"
  (cl-smtp:with-smtp-mail (out "localhost" "[email protected]" recipients)
    (cl-mime:print-mime out
                        (make-instance 'cl-mime:text-mime
                                       :encoding :base64 :charset "UTF-8"
                                       :content (arnesi:string-to-octets text :utf-8))
                        t t)))

Lessons learned

To send a plain-text ASCII email, use CL-SMTP:SEND-EMAIL

If USOCKET:UNKNOWN-ERROR is signaled, most probably the arguments are not properly formatted

For debugging, use (setf cl-smtp::*debug* t)

To use MIME effectively, combine CL-SMTP:WITH-SMTP-MAIL with CL-MIME:PRINT-MIME

You need to supply an octet vector, not a string, to CL-MIME:TEXT-MIME's :CONTENT initarg

To break a UTF-8 string into octets, use ARNESI:STRING-TO-OCTETS

2009-02-01

Nokia Locate Sensor

From idea:

Received: by 10.141.28.2 with HTTP; Mon, 22 Oct 2007 05:25:31 -0700 (PDT)
Message-ID: <[email protected]>
Date: Mon, 22 Oct 2007 15:25:31 +0300
From: Vsevolod
To: [email protected]
Subject: idea to help not forget phones
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_Part_8479_1626797.1193055931431"
Delivered-To: [email protected]

------=_Part_8479_1626797.1193055931431
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Hello Jan,

My name is Vsevolod Dyomkin. I wanted to share a design idea related to
mobile phones with you, as an only person I can presently reach, who can
possibly facilitate its implementation.

Today I have seen your presentation at TED.com concerning the uses of mobile
phones. What interested me in it is the mention of places, which serve as
gravity centers for important carried items. In my opinion, the existence of
such places, in spite of the natural need for them, brings about one big
potential inconvenience…

Take my example: I work inside a big building, and at my workplace the
mobile signal is very poor, so I'm forced to leave my phone in another
part of the room where the connectivity is better. This sometimes leads to
unpleasant situations when I forget the phone on leaving the room and the
building. Another example is when a person comes to a party and leaves
her phone somewhere so as not to be distracted by it. Afterwards she'll quite
probably forget about it. This may be the case not only with phones, but
also with other carried items.

To prevent such situations I've come up with the following idea: to have a
small device which alerts you (e.g. beeps) when you move, for example, 10
meters away from the item. It will consist of two parts: a number of RFID tags
(in the form factor of small round colored stickers), which can be stuck to
a mobile phone, a key, an ID card etc., and a receiver/speaker, which can be
a charm on a keyring or a bangle, which beeps. The receiver can optionally
show the color of the sticker that caused the alarm.

To me this is an example of delegating mundane/error-prone tasks to
technology :-), in this case delegating the need to pat one's
pockets...

If you find it interesting, feel free to contact me.
Best regards
Vsevolod

...to implementation

How lexical scope is important

"Fexprs more flexible, powerful, easier to learn? (Newlisp vs CL)" @ c.l.l.
Rainer Joswig (with some participation from Kaz Kylheku and Pascal Bourguignon) uses a practical example to explain which problems of dynamic scope (still used in the suggested "improved" newLisp, which actually turns out to be old :) are solved by lexical scope.
Bonus: how to create Lisp-style special global variables in C++ (and a discussion of what can be improved in CL in this regard).
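
To make the difference concrete, here is a minimal sketch in plain Common Lisp. It is not taken from the thread: all function and variable names are made up for illustration, and dynamic scope is emulated with a special variable, since that is the only place CL has it. The point is that a lexical closure keeps the binding it was created with, while a dynamically scoped function sees whatever the caller happened to bind.

```lisp
;; Lexical scope: the closure captures the binding of STEP that was
;; visible where the LAMBDA was written.
(defun make-adder (step)
  (lambda (x) (+ x step)))

;; Dynamic scope, emulated with a special variable: the function sees
;; whatever binding of *STEP* is current at the time of the call.
(defvar *step* 1)

(defun make-dynamic-adder ()
  (lambda (x) (+ x *step*)))

(defun call-with-local-step (fn x)
  ;; The caller uses a local variable named STEP for its own purposes.
  ;; It is lexical, so FN cannot see it.
  (let ((step 100))
    (declare (ignorable step))
    (funcall fn x)))

(defun call-with-rebound-step (fn x)
  ;; The caller rebinds the special variable; FN picks up the new value.
  (let ((*step* 100))
    (funcall fn x)))

(call-with-local-step (make-adder 1) 5)          ; => 6, caller's STEP is invisible
(call-with-rebound-step (make-dynamic-adder) 5)  ; => 105, caller's binding leaks in
```

In a language where every variable behaves like *STEP*, any caller that reuses a variable name can silently change what a passed-in function computes, which is the kind of problem lexical scope removes.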

2008-12-25

A technology for reliable and convenient authentication on the web

Written for: habrahabr.ru
Written in: January 2008

...
At the moment, all the authentication methods used in web applications are either not secure enough or too inconvenient to use. This is precisely why a global system of micropayments over the internet still hasn't appeared.

What exactly makes each of the existing methods ineffective?

Simple password: convenient, but there are several threats, and the main one is not so much that someone unauthorized learns it as that roughly the same login/password combination may be used for many different services, some of which may be insufficiently protected.
One-time passwords: secure and relatively convenient (although an extra device is still needed), but rather expensive.
Digital signature certificates: secure, but very inconvenient (problems with cross-platform token support), and also expensive.
Using a second communication channel for confirmation (usually a mobile phone): relatively secure, relatively convenient, relatively scalable (for now...).
OpenID: secure, but hard to access at the moment, because 99% of people don't have a trusted web server.

However, it is already possible to aim for a global authentication system by combining 3 phenomena that have already become reality:

IPv6;
OpenID;
a stable internet connection from a mobile phone/communicator.

Here it is:
Every mobile phone, while in its provider's network, will be permanently connected to the internet and will have a static IPv6 address as well as a DNS name of the form <phone number>.<operator domain>. An OpenID service will be built into every phone.
Thus, a person will only need to log in to their phone every morning to be able to authenticate automatically on any site. Such a system, of course, has a weak point: the phone itself, which, if seized, lets attackers pass themselves off as its owner. But even at first glance there seem to be quite a few ways to protect against this:

(not to mention blocking the phone by calling the operator);
for sensitive transactions (for example, payments), additional authorization can be required, for example a password (which already gives two-factor authentication);
biometric authentication can be added, or an additional token, for example an RFID key fob that a person can wear on a keyring, around the neck or on the wrist, and which must be no farther than, say, 2 m from the phone for the OpenID service to work.
I think there are other reasonable ways as well...
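
As a rough illustration of the scheme and of the protections listed above, here is a small sketch in plain Common Lisp. Everything in it is an assumption made for the example: the function names, the keyword parameters, the sample phone number and the example.net domain are hypothetical, and it treats the optional measures (daily login, RFID token within 2 m, password for sensitive transactions) as if all of them were enabled at once.

```lisp
;; Hypothetical helper: builds the DNS name <phone number>.<operator domain>
;; under which the handset's OpenID service would be reachable.
(defun phone-openid-host (phone-number operator-domain)
  (format nil "~a.~a"
          (remove-if-not #'digit-char-p phone-number) ; keep digits only
          operator-domain))

;; Hypothetical policy check: should the phone's OpenID service confirm
;; the owner's identity for this request?
(defun may-authenticate-p (&key logged-in-today-p token-distance-m
                                sensitive-p password-ok-p
                                (max-token-distance-m 2))
  (and logged-in-today-p
       ;; The RFID token must be detected and close enough to the phone.
       token-distance-m
       (<= token-distance-m max-token-distance-m)
       ;; Sensitive transactions (e.g. payments) also require a password.
       (or (not sensitive-p) password-ok-p)
       t))

(phone-openid-host "+380 44 123-45-67" "example.net")
;; => "380441234567.example.net"

(may-authenticate-p :logged-in-today-p t :token-distance-m 1.5)
;; => T
(may-authenticate-p :logged-in-today-p t :token-distance-m 1.5 :sensitive-p t)
;; => NIL, a payment without the extra password is refused
```

A stolen phone fails the check as soon as it is carried away from the token, and payments stay protected by the second factor even if the token is stolen together with it.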

In such a system, mobile operators can take on the role of mobile micro-banks, if payments are made directly from the personal account with the operator, or of authentication service providers (the first attempts to implement this approach are already appearing, but with proprietary authentication systems that have no prospects of scaling beyond individual payment systems).