codedot | Entries tagged with programming

I should have learnt this before.

Let us suppose you have just made a pull request with about twenty commits: something borrowed, something blue, and something from Linus. Say, a few commits have typos in their commit logs, a few should have been merged into a single one, some should have been removed, a couple are so huge that you should split them into several, and, except for maybe one lucky change, the rest is to be completely rewritten.

On GitHub, however, it is not common practice to deal with pull requests that way. After a pull request has been reviewed, it usually gets nearly as many commits appended as there were in the original pull request. When the pull request is finally merged, this in turn leaves the master branch containing almost nothing but garbage.

Reviewing patch series and making the developer redo all the stuff from the very beginning, probably several times, might look like an inapplicable approach to dealing with pull requests. Besides, it does not guarantee that the master branch contains no garbage. Of course, it is not a silver bullet. Nevertheless, it does help avoid more than 90% of junk, keeping the master branch log much cleaner. That turns out to be really helpful when you hunt a bug with git-bisect(1).

There is at least one scheme that makes GitHub-style pull requests do the trick, quite close to reviewing patch series on LKML:

Make a new pull request from bug2645 to master.
Discuss the changes and how to improve them until it is clear what to do for the next iteration.
Close the pull request in order to save the resulting review.
Fork a backup with git branch bug2645-backup bug2645 just in case.
Play with git rebase -i master (edit and squash), git reset HEAD^ (splitting commits), git add -p wtf.c (s and e), and git stash -k (test results before committing) to address the comments from the review.
When you are done, type git push -f origin bug2645 and start from the very beginning.

This scheme has been tested on an artificial task simulating a huge and ugly patch series. Specifically, we cleared the master branch and pretended that its backup was a development branch far ahead of master. Then we agreed on rules for writing commit logs that differ from what we did before. Namely, every commit log should have the form 2645: update time stamps in msync(), where 2645 is the number of the GitHub issue that corresponds to the applied changes. This way, one can always track exactly which bug led to each particular commit.
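
The commit log convention above is simple enough to check mechanically. Below is a minimal sketch in JavaScript. The helper name checkSubject and the exact pattern (issue number, colon, one space, non-empty summary) are assumptions made for illustration, not rules stated by the scheme itself.

```javascript
// Hypothetical helper: returns the issue number if the first line of a
// commit log looks like "2645: update time stamps in msync()", else null.
function checkSubject(log) {
	var subject = log.split("\n")[0];
	var match = /^(\d+): (\S.*)$/.exec(subject);

	return match ? Number(match[1]) : null;
}

console.log(checkSubject("2645: update time stamps in msync()")); // 2645
console.log(checkSubject("fix typo")); // null
```

Something like this could be run over the commit subjects of a branch before opening the next iteration of the pull request.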

So, give it a try!

From the Preface

Have you ever...

wasted a lot of time coding the wrong algorithm?

used a data structure that was much too complicated?

tested a program but missed an obvious problem?

spent a day looking for a bug you should have found in five minutes?

needed to make a program run three times faster and use less memory?

struggled to move a program from a workstation to a PC or vice versa?

tried to make a modest change in someone else's program?

rewritten a program because you couldn't understand it?

Was it fun?

These things happen to programmers all the time. But dealing with such problems is often harder than it should be because topics like testing, debugging, portability, performance, design alternatives, and style -- the practice of programming -- are not usually the focus of computer science or programming courses. Most programmers learn them haphazardly as their experience grows, and a few never learn them at all.

In a world of enormous and intricate interfaces, constantly changing tools and languages and systems, and relentless pressure for more of everything, one can lose sight of the basic principles -- simplicity, clarity, generality -- that form the bedrock of good software. One can also overlook the value of tools and notations that mechanize some of software creation and thus enlist the computer in its own programming.

Our approach in this book is based on these underlying, interrelated principles, which apply at all levels of computing. These include simplicity, which keeps programs short and manageable; clarity, which makes sure they are easy to understand, for people as well as machines; generality, which means they work well in a broad range of situations and adapt well as new situations arise; and automation, which lets the machine do the work for us, freeing us from mundane tasks. By looking at computer programming in a variety of languages, from algorithms and data structures through design, debugging, testing, and performance improvement, we can illustrate universal engineering concepts that are independent of language, operating system, or programming paradigm.

This book comes from many years of experience writing and maintaining a lot of software, teaching programming courses, and working with a wide variety of programmers. We want to share lessons about practical issues, to pass on insights from our experience, and to suggest ways for programmers of all levels to be more proficient and productive.

http://cm.bell-labs.com/cm/cs/tpop/preface.html

Last year I followed the release of the Google Chromebook rather closely, above all the Samsung model. Fairly soon after the release it appeared in the UK too, from where it could already be ordered within the EU without customs. The only problem with ordering from the UK, in fact, turned out to be the keyboard. The thing is, I physiologically cannot accept any keyboard other than the regular American one, on which both Shift keys and the Enter key are wide rather than mangled by unknown forces.

Ordering from the USA to the EU became possible much later, through the online store TigerDirect.com. Exactly one week passed between the bank transfer and picking up the parcel at UPS. I spent another week getting a feel for it. And last weekend I finally ventured an experiment: using only the Chromebook for work. That is what I needed SSH access with screen(1) at login for. I also need port forwarding to make testing a Web application easier, so a config with the corresponding lines joined the key files.

By and large, for years I have used nothing but a terminal with SSH and a browser. Everything else popped up, startled, got in the way, and drew attention in every possible manner, distracting from work like a Tamagotchi. That may belong on a phone, but not on a work machine. Of course, the MacBook Pro is powerful, but I saw no point in lugging an extra kilogram around for the sake of some rare cases of locally building software of dubious origin. Let the server heat up, not the laptop.

So now I usually have a single full-screen browser window with three pinned tabs: mail (with desktop notifications), a shell over SSH (to the iMac at home), and the application under test itself. The remaining tabs are used as usual: documentation, GitHub, the bug tracker, the Wiki, and so on. I run the unstable version of the system; however, as far as I understand, they recently updated the stable branch to one of the beta versions, along with the release of the next version of the Samsung computer, which now mimics Apple even more.

Just in case, the MacBook lay next to it on the desk the whole time, but I never had to go back to it, nor even wanted to. That is how positive my experience with the Chromebook has been. I did not think I would ever say this again after switching to the Mac, but I have never had a better workstation.

Now this is something I have not really seen in other programming languages. It comes out as a kind of "Lisp lambda":

alexo@codedot:~$ wscat -c ws://localhost:8080/

connected (press CTRL+C to quit)

< (function () {
			socket.say = function(data) {
				data = JSON.stringify(data);
				this.send(data);
			};
		})(undefined)

< (function () {
			socket.say("Hello World!");
		})(undefined)
> "Hello World!"

< (function (data) {
			window.alert(data);
		})("Hello World!")

disconnected

alexo@codedot:~$

By the way, wscat was fixed yesterday.
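
Every message in the session above is a self-applying function expression that the receiving side evaluates. The sketch below shows one way such messages could be produced and consumed. The helper name lambda is hypothetical, and whether the original code builds its messages this way is an assumption, although concatenating an undefined argument does yield the same (undefined) tail seen in the session.

```javascript
// Hypothetical helper: serialize a function and its argument into a
// self-applying expression, shaped like the messages in the session.
function lambda(fn, arg) {
	return "(" + fn.toString() + ")(" + JSON.stringify(arg) + ")";
}

var msg = lambda(function (data) {
	return data.toUpperCase();
}, "Hello World!");

// The receiving side evaluates the text of the message.
console.log(eval(msg)); // HELLO WORLD!
```

The sender decides what code runs, and the receiver only needs eval, which is why this is suitable for experiments among trusted peers only.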

I have published my first package in the Node.js junkyard. The "Hello World!" example, which will be available at a local address, can now be run as follows:

alexo@codedot:/tmp$ npm install uniweb
npm http GET https://registry.npmjs.org/uniweb
npm http 200 https://registry.npmjs.org/uniweb
npm http GET https://registry.npmjs.org/ws
npm http 304 https://registry.npmjs.org/ws

> ws@0.4.13 preinstall /private/tmp/node_modules/uniweb/node_modules/ws
> make

node-waf configure build
Checking for program g++ or c++          : /usr/bin/g++ 
Checking for program cpp                 : /usr/bin/cpp 
Checking for program ar                  : /usr/bin/ar 
Checking for program ranlib              : /usr/bin/ranlib 
Checking for g++                         : ok  
Checking for node path                   : ok /usr/local/lib/node_modules 
Checking for node prefix                 : ok /usr/local 
'configure' finished successfully (0.183s)
Waf: Entering directory `/private/tmp/node_modules/uniweb/node_modules/ws/build'
[1/4] cxx: src/validation.cc -> build/Release/src/validation_1.o
[2/4] cxx: src/bufferutil.cc -> build/Release/src/bufferutil_2.o
[3/4] cxx_link: build/Release/src/validation_1.o -> build/Release/validation.node
[4/4] cxx_link: build/Release/src/bufferutil_2.o -> build/Release/bufferutil.node
Waf: Leaving directory `/private/tmp/node_modules/uniweb/node_modules/ws/build'
'build' finished successfully (1.050s)
npm http GET https://registry.npmjs.org/commander
npm http GET https://registry.npmjs.org/options
npm http 304 https://registry.npmjs.org/commander
npm http 304 https://registry.npmjs.org/options
uniweb@1.1.1 ./node_modules/uniweb
└── ws@0.4.13 (options@0.0.3, commander@0.5.2)
alexo@codedot:/tmp$ npm test uniweb

> uniweb@1.1.1 test /private/tmp/node_modules/uniweb
> node hello.js

Not counting the example, the package code takes only 62 lines. Below is a combined HTTP/HTTPS/WebSocket server built on it, which runs as root because it uses ports 80 and 443, and expects the SSL certificate in the files key.pem and cert.pem:

var start = require("uniweb");
var read = require("fs").readFileSync;

function hello(socket) {
	socket.on("message", function(msg) {
		socket.send("alert(\"" + msg + "\");");
	});
	socket.send("socket.send(\"Hello World!\");");
}

start({
	handler: hello,
	domain: "example.com",
	key: read("key.pem"),
	cert: read("cert.pem")
});
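
The handler above splices msg between hand-written quotes, so a message that itself contains a double quote or a backslash would turn into broken script text on the client. One possible way around this, sketched below, is to let JSON.stringify produce the string literal. The helper name alertScript is hypothetical and is not part of uniweb or ws.

```javascript
// Hypothetical helper: build the script text sent to the client.
// JSON.stringify emits a quoted literal with quotes, backslashes and
// newlines escaped, so the result stays a single alert() call.
function alertScript(msg) {
	return "alert(" + JSON.stringify(String(msg)) + ");";
}

console.log(alertScript("Hello World!")); // alert("Hello World!");
console.log(alertScript("say \"hi\""));   // alert("say \"hi\"");
```

This is only a sketch of the quoting step. The client still evaluates whatever the server sends, so the overall design remains a demo rather than something to expose to untrusted input.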

Below are index.html and server.js sources. The server expects the index.html file as well as SSL key.pem and cert.pem to be located in the current directory.

<!doctype html>
<meta charset="utf-8">
<title></title>

<script>
var socket = new WebSocket("ws://localhost/");

socket.onmessage = function(msg) {
	eval(msg.data);
};
</script>

var read = require("fs").readFileSync;

function server(socket)
{
	socket.on("message", function(msg) {
		socket.send("alert(\"" + msg + "\");");
	});
	socket.send("socket.send(\"Hello World!\");");
}

function client(ws)
{
	var html = read("index.html", "utf8");

	return function(request, response) {
		response.writeHead(200, {
			"Content-Type": "text/html"
		});
		response.end(html.replace("ws", ws));
	};
}

function uniweb(host, ws, port)
{
	(new (require("ws").Server)({
		server: host
	})).on("connection", server);
	host.on("request", client(ws));
	host.listen(port);
}

uniweb(require("http").createServer(), "ws", 80);
uniweb(require("https").createServer({
	key: read("key.pem"),
	cert: read("cert.pem")
}), "wss", 443);
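
The client() function above relies on String.prototype.replace with a string pattern, which rewrites only the first occurrence of "ws". That is what lets one index.html serve both schemes. A small sketch of the substitution follows. The names line and scheme are for illustration only, and the second call uses a made-up title to show the caveat.

```javascript
// The WebSocket line as it appears in index.html.
var line = "var socket = new WebSocket(\"ws://localhost/\");";

// Same substitution as in client(): only the first "ws" is replaced.
function scheme(text, ws) {
	return text.replace("ws", ws);
}

// "WebSocket" has no lowercase "ws", so the URL scheme is hit first.
console.log(scheme(line, "wss"));

// Caveat: any earlier "ws" in the page would be rewritten instead.
console.log(scheme("<title>news</title>" + line, "wss"));
```

With the empty title used in index.html the first match is the URL scheme, so the shortcut works, but it depends on nothing earlier in the file containing "ws".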

<!doctype html>
<meta charset="utf-8">
<title></title>

<script>
var socket = new WebSocket("wss://127.0.0.1:8888/");

socket.onmessage = function(msg) {
	eval(msg.data);
};
</script>

The HTML5 code above, which passes W3C validation, uses the secure WebSocket protocol, unlike the one in the previous entry. One can easily create a Node.js-based secure WebSocket server for such a client as follows. We assume that the files key.pem and cert.pem contain the private key and the corresponding certificate, both PEM-formatted, and that server.js contains the source code below. If you have installed the ws package using npm install ws, running node server.js will then start the server.

var fs = require("fs");
var https = require("https");
var ws = require("ws");

var host = {
	server: https.createServer({
		key: fs.readFileSync("key.pem"),
		cert: fs.readFileSync("cert.pem")
	})
};

(new ws.Server(host)).on("connection", function(socket) {
	socket.on("message", function(msg) {
		socket.send("window.alert(\"" + msg + "\");");
	});
	socket.send("socket.send(\"Hello World!\");");
});

host.server.listen(8888);

alexo@codedot:/tmp$ npm install websocket-server
npm http GET https://registry.npmjs.org/websocket-server
npm http 304 https://registry.npmjs.org/websocket-server
websocket-server@1.4.04 ./node_modules/websocket-server
alexo@codedot:/tmp$ cat >server.js
var server = require("websocket-server").createServer();

server.addListener("connection", function(connection) {
	var id = connection.id;

	connection.addListener("message", function(msg) {
		var feedback = "window.alert(\"" + msg + "\");";

		server.send(id, feedback);
	});

	server.send(id, "socket.send(\"Hello World!\");");
});

server.listen(8080);
alexo@codedot:/tmp$ node server.js

<!doctype html>
<meta charset="utf-8">
<title></title>

<script>
var socket = new WebSocket("ws://127.0.0.1:8080/");

socket.onmessage = function(msg) {
	eval(msg.data);
};
</script>

Assuming Node.js is installed, one can run a broadcasting WebSocket server like this:

alexo@uniweb:~/chat$ npm install websocket-server
npm http GET https://registry.npmjs.org/websocket-server
npm http 304 https://registry.npmjs.org/websocket-server
websocket-server@1.4.04 ./node_modules/websocket-server
alexo@uniweb:~/chat$ cat >server.js 
var server = require("websocket-server").createServer();

server.addListener("connection", function(connection) {
	connection.addListener("message", function(msg) {
		server.broadcast(msg);
	});
});

server.listen(8080);
alexo@uniweb:~/chat$ node server.js

Then, we can implement a primitive multiuser whiteboard in HTML5 as follows:

( Client )

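You had better watch out when XSI is enabled: <math.h> then declares the Bessel functions j0(), j1(), jn(), y0(), y1(), and yn() at file scope!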

alexo@codedot:/tmp$ getconf _XOPEN_UNIX
1
alexo@codedot:/tmp$ cat bessel.c 
#include <math.h>
#include <stdio.h>

main()
{
	{
		static *j0, *j1, *jn;

		printf("%p %p %p\n", j0, j1, jn);
	}

	{
		static *y0, *y1, *yn;

		printf("%p %p %p\n", y0, y1, yn);
	}

	printf("%p %p %p\n", j0, j1, jn);
	printf("%p %p %p\n", y0, y1, yn);
	return 0;
}
alexo@codedot:/tmp$ c99 -lm bessel.c 2>&- && ./a.out
0x0 0x0 0x0
0x0 0x0 0x0
0x106a70580 0x106a70e00 0x106a71680
0x106a709d0 0x106a71230 0x106a71bc0
alexo@codedot:/tmp$

Why not parse FEN in C while I am at it? Let's parse it:

#include <assert.h>
#include <ctype.h>
#include <stdio.h>

#define RANK "%8[12345678PNBRQKpnbrqk]"
#define ENPASS "%2[abcdefgh12345678-]"
#define FEN \
	RANK "/" RANK "/" RANK "/" RANK "/" \
	RANK "/" RANK "/" RANK "/" RANK " " \
	"%1[wb] %4[KQkq-] " ENPASS " %*2d %*d\n"
#define RANKS(board) \
	(board)[7], (board)[6], (board)[5], (board)[4], \
	(board)[3], (board)[2], (board)[1], (board)[0]

( The smut behind the cut is no softer )

Terminology used below is defined in the corresponding chapter of the XBD volume.

Task statement: implement IEEE 1003.1 with all the options disabled and system limits set to their minimum acceptable values.

Requirements:

Implementation is real-time in the sense of implementation-defined system clock which need not correspond to real-time clock.
Implementation supports exactly one user ID, namely zero. Every possible login name (see _POSIX_LOGIN_NAME_MAX) has the same user ID and its home directory set to /.
Implementation supports multiple logins.
System boot has finished exactly once before any login until a system crash. Any system crash shall be followed by system reboot.
Implementation has exactly one file system with its root at / which may be non-conforming.

Means:

One Google AppEngine application on the server's side.
HTML5 and ECMAScript with AJAX extension on any client's side.

Rationale: since C-Language Development Utilities are optional functionality, and the c99 utility is the only way to compile an application, System Interfaces are not required to be actually implemented. That is, mandatory Shell and Utilities are the only functionality which is required to be implemented, of course, in conformance with Base Definitions and System Interfaces. The file system may be non-conforming (for example, legacy file systems for which _POSIX_NO_TRUNC is false, case-insensitive file systems, or network file systems) because any Strictly Conforming POSIX Application is required to tolerate and permitted to adapt to the presence or absence of optional facilities, and the latter include non-conforming file systems. Taking into account mandatory IPv4 support and that no other hardware interfaces are specified, the only user interface for UNIX that can be implemented as a Strictly Conforming POSIX Application is a Web terminal. System clock is not required to correspond to wall clock in order to allow implementation to stop background processes when no login is being used.

Implementation described above is a combination of the following ideas: Web console as the only interface for UNIX and minimal POSIX implementation as a test suite for checking Strictly Conforming POSIX Applications. A Google AppEngine application called uniwebcore has been reserved for this task.

Interview questions on systems programming:

What does sleep() return?
How will a program behave that calls fork() and then reads characters typed by the user from standard input one at a time, printing its process ID?
How would you implement a fork() system call that runs the child process's code on a remote machine?

Problem: write a C program that copies standard input to standard output with the characters in reverse order, using only getchar() and putchar() from the standard library, and at most O(n) memory and O(n) time, where n is the length of the input.

Hint: the program may consist of a single function.

( Solution )

The following command aspires to be the most complicated simple command in Shell, using no escape sequences, quoted expressions, word expansions, or any part of the Shell grammar other than simple commands.

The command itself is commented out and described below in a more readable script. The command or its parts could be of use for an interview quiz, for instance, as it touches all the relevant tokens specified by the standard and several non-obvious interpretations.

#0>& 1 >&- X= 2>>-3- =X= 4<<-5- =X 6<>7 8>|9 0<& 1 <& -

CMD="=X="
ARG="=X"

ERRFILE="-3-"
IOFILE="7"
OUTFILE="9"

X="" $CMD $ARG 2>>$ERRFILE 6<>$IOFILE 8>|$OUTFILE 4<<- 5-
	Here-document starts.

	This is a sample text.

	That's it.
5-

Here is an artificial, purely formalistic problem: make it possible to implement a system call in pure ANSI C. The latter means no non-standard character constants and no non-standard constructs, inline assembly included.

For obvious reasons, the only variable part, which will constitute the solution, can be the compiler rather than the computer architecture. In other words, to solve the problem it is necessary and sufficient to present an ANSI C compiler that nowhere contradicts the standard and yet lets the user obtain, from certain statements of pure ANSI C, compiled system calls of the kind traditionally implemented in assembly language.

It is easy to conclude that the solution can be found where the standard does not specify behavior, leaving freedom to implementations. The preprocessor with its #pragma is ruled out by definition, and an attempt to consider the standard libraries leads recursively back to the original statement of the problem. Thus the only place left is the core of the language.

It is logical to look at division by zero. Indeed, "if the second operand [of / or %] is 0, the result is undefined". Hence a system call can be implemented at least as follows:

void syscall(int num)
{
	num / 0;
}

Other solutions to this problem are of interest, too. For example, blacklion proposed a solution that does not even require any specific compiler, but merely adds one more system call interface alongside the usual syscall assembly instruction. The idea is not to treat dereferencing a null pointer as an attempt to access forbidden memory, but to have the operating system itself intercept that access in the MMU. A function for invoking a system call without magic constants and without operations whose behavior is unspecified in ANSI C might look as follows:

void syscall(char *data)
{
	static volatile char **const syscallp;

	*syscallp = data;
}

The comments on the previous post, provoked by a mere mention of the C language without any non-trivial remarks, showed that what is beyond discussion in programming indeed draws no reaction from those familiar with the practice of programming, yet with everyone else it can paradoxically become a subject of discussion, whatever precautions are taken against a pointless waste of time.

The last paragraph of that post contained one hard criterion and one weak one. In fact, such care was unnecessary, and the last criterion could have been the only one. This is possible because wherever one can gain practice in C programming, SUS is somewhere nearby: below, beside, or above. The rare cases of purely academic good knowledge of this language, one of the simplest yet usually studied only at an extremely superficial level, again cannot provide enough practice to judge it.

That is exactly why, if a person has never even heard of the portable operating system interface and confuses portability with being cross-platform, any discussion of the practice of C programming is out of the question.

Use idiomatic constructs wherever possible. To do so, you need to know the language and its idioms well, just as with human languages. Lexical and syntactic correctness of the text, proper formatting, and the absence of factual and logical errors are not enough. Short, easily recognizable, crisp, clear, and concise expressions are also required.

For example, a for loop is justified only when its header uses a single variable solely to traverse arrays and lists. Do not force the programmer to pay attention to that header. On seeing for, the reader expects an idiomatic construct, and only the loop body should matter to them.

"Use long, descriptive names for global objects and short ones for local ones... Local variables used in conventional ways can have very short names. For example, loop counters are traditionally named i and j, pointers p and q, and strings s and t. Using longer names instead gains nothing and may even lose something. Consider this example:

?	for (theElementIndex = 0; theElementIndex < numberOfElements; theElementIndex++)
?		elementArray[theElementIndex] = theElementIndex;

Now let us rewrite it as follows:

	for (i = 0; i < nelems; i++)
		elem[i] = i;

Programmers are often encouraged to use long names regardless of context. That is a mistake: clarity is often achieved precisely through brevity...

1.6. Comments

Don't belabor the obvious... Don't comment bad code, rewrite it... writing good code has much in common with writing good prose... Stick to the standard... Program in the mainstream... Beware of language trouble spots... Use standard libraries... Use only features available everywhere... Avoid conditional compilation... Maintain compatibility with existing programs and data" (Kernighan and Pike, "The Practice of Programming", Russian translation edited by V. L. Brodovoy, 2004).

I consider it absolutely pointless to discuss programming with anyone unfamiliar with the content of the above-mentioned book (not necessarily from the book itself) and with the ANSI C programming language to the extent of Kernighan and Ritchie's textbook. Even absent that contraindication, the meaningfulness of a discussion immediately comes under serious doubt if the other party does not feel confident working with the IEEE 1003.1 standard, the main application of this language.

I want 1) an interpreted programming language 2) with the following syntax (code sample) or, at least, one similar in its notation for expressions:

<text> ::= <term> | <assign> <text>;

<assign> ::= <ID> '=' <term> ';';

<term> ::= <appl> | <abstr>;

<abstr> ::= <ID> ':' <term> | <ID> ',' <abstr>;

<appl> ::= <atom> | <appl> <atom>;

<atom> ::= <ID> | '[' <term> ']' | '{' <term> '}' | '(' <term> ')';

3) untyped or weakly typed, 4) possibly with a set of delta functions for fast operations on numbers, strings, etc., and, finally, 5) with normal-order evaluation.

Give it to me.

The `test.mlc' file in the MLC source code has changed. It now presents sample syntactic sugar defined differently than before. Namely, the new version aims at a more readable and transparent form of the basic combinators. Each definition now also carries a comment meant to provide some rationale.

( The resulting text )

Profile

Anton Salikhmetov


Page Summary


Using Pull Requests as Patch Series

The Practice of Programming

A Week of Working with a Chromebook

My Second Node.js Package

My First Node.js Package

Combined HTTP/HTTPS/WebSocket Server

Minimal Universal Secure Web Client

Minimal Universal HTML5

Simple Multiuser WebSocket Whiteboard

Bessel Functions

A FEN Parser in C

POSIX in the Cloud

Questions on UNIX

A Problem on Memory Usage in C

The Most Complicated Simple Command in Shell

A System Call in ANSI C

Sufficiency of the SUS Criterion

The Undiscussable in Programming

The Programming Language of My Dreams

MLC sample syntactic sugar changed