Category Archives: Software

Using Chinese in TeX equations in Zim

Locate the _Equation.tex template file and change it to the following.

\documentclass[12pt]{article}
\pagestyle{empty}
\usepackage{CJKutf8}

\usepackage{amssymb}
\usepackage{amsmath}
\usepackage[usenames]{color}

\begin{document}
\begin{CJK}{UTF8}{gbsn}

% No empty lines allowed in math block !
\begin{align*}
[% equation -%]
\end{align*}

\end{CJK}
\end{document}

When inserting an equation that needs Chinese text (the middle line below is a sample Chinese sentence), use

\end{align*}
这里是中文
\begin{align*}

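to step out of the align* environment. Alternatively, remove the align* environment from the template file altogether, though that is inconvenient in most cases.

The same trick of stepping out of the align* environment also works if you want to insert a table.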

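Fixing a KDE trash that reports no free space

Symptom: files cannot be deleted, and deleting one brings up the message "The trash has reached its maximum size! Please clean the trash manually." Opening the trash in Dolphin shows no files at all, and the partition that holds the trash has plenty of free space.

One possible cause: the value of the Size entry in ~/.local/share/Trash/metadata is too large.

Fix: back up ~/.local/share/Trash/metadata, then delete it.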
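
The backup-then-delete step can also be scripted. Below is a minimal Python sketch, not a tested KDE tool: the function name `reset_trash_metadata` and the backup file name `metadata.bak` are made up for illustration, and it assumes the trash directory is the one named above.

```python
import shutil
from pathlib import Path
from typing import Optional


def reset_trash_metadata(trash_dir: Path) -> Optional[Path]:
    """Back up the trash 'metadata' file, then delete it.

    Returns the path of the backup, or None if there was no metadata file.
    """
    metadata = trash_dir / "metadata"
    if not metadata.is_file():
        return None  # nothing to back up or delete
    backup = trash_dir / "metadata.bak"  # hypothetical backup name
    shutil.copy2(metadata, backup)       # keep a copy first
    metadata.unlink()                    # then remove the original
    return backup
```

It would be called as `reset_trash_metadata(Path.home() / ".local/share/Trash")`. Nothing runs on import, so the file is only touched when the function is called explicitly.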

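Python packages for working with PDF files

  • pyPdf
    A pure-Python PDF toolkit.
    Home page: http://pybrary.net/pyPdf/
    Main features:
    • Read document information (title, author, ...)
    • Split a file page by page
    • Merge files page by page
    • Crop pages
    • Merge several pages into a single page
    • Encrypt and decrypt PDF files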
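  • ReportLab
    A powerful library for generating PDF files.
    Home page: http://www.reportlab.com/software/opensource/rl-toolkit/
    Main features:

    • Create professional PDF files
    • Real document layout engine (Platypus), apparently a very capable page layout engine
    • Flowable objects such as paragraphs, headlines, tables, images, and graphics
    • Support for embedded Type-1 or TTF fonts
    • Support for Asian, Hebrew, and Arabic characters
    • Support for bitmap images in any popular format
    • Support for vector graphics
    • Includes a library of reusable basic shapes
    • Extensible widget library
    • Layered architecture, written in Python
    • Includes simple demos and more complex tools
    • Allows any data source to be used
    • Fully available source code
    • Strong community support
    • Cross-platform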
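  • PDFMiner
    A tool mainly for analyzing the text content of PDF files.
    Home page: http://www.unixuser.org/~euske/python/pdfminer/index.html
    Main features:

    • Pure Python (version 2.4 or later)
    • Parse, analyze, and convert PDF documents
    • Supports the PDF-1.7 specification (almost completely)
    • Supports CJK languages and vertical writing
    • Supports various font types (Type1, TrueType, Type3, and CID)
    • Basic encryption support (RC4)
    • PDF to HTML conversion (with a simple web converter)
    • Outline (TOC) extraction
    • Tagged content extraction
    • Reconstruction of the original layout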

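In short: to manipulate existing PDF files, use pyPdf; to generate PDF files with new content, use ReportLab; to analyze the content of existing PDF files, use PDFMiner.

For generating PDFs, though, I prefer the LaTeX family of tools: the quality is dependable and problems such as garbled characters are less common.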

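Email auto-reply lookup tool

The title is not quite accurate. What I have in mind is something like this: after an exam, a student sends an email with their student ID, say 1234, as the subject to the mailbox abc; when the system receives the message, it looks up the grade in a database and replies to the student.

Is there software that does this?

Or is there a WordPress plugin that can query a given column of a given table and, on a match, return the value of another column in the same row?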
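
For what it is worth, the lookup-and-reply core of such a tool is small. Here is a rough Python sketch using only the standard library's `email` package. It is not an existing product: the `GRADES` dictionary stands in for the database, `build_reply` is a made-up name, and fetching and sending mail (IMAP/SMTP) is left out entirely.

```python
from email.message import EmailMessage

# Hypothetical grade table; a real system would query a database instead.
GRADES = {"1234": "87", "5678": "92"}


def build_reply(incoming: EmailMessage, grades: dict) -> EmailMessage:
    """Build a reply reporting the grade for the student ID in the subject."""
    student_id = (incoming["Subject"] or "").strip()
    grade = grades.get(student_id)

    reply = EmailMessage()
    reply["To"] = incoming["From"]          # answer the sender
    reply["Subject"] = "Re: " + student_id
    if grade is None:
        reply.set_content("No record found for student ID %r." % student_id)
    else:
        reply.set_content("Grade for student ID %s: %s" % (student_id, grade))
    return reply
```

A real deployment would also need to check that the sender is actually the student with that ID, which this sketch does not attempt.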

[Repost] R programming books (updated)

Original post: R programming books (updated) | (R news & tutorials)

via R-bloggers by csgillespie on 1/28/11


In a recent post, I asked for suggestions for introductory R computing books. In particular, I was looking for books that:

  • Assume no prior knowledge of programming.
  • Assume very little knowledge of statistics. For example, no regression.
  • Are cheap, since they are for undergraduate students.

Some of my cons aren’t really downsides as such. Rather, they just indicate that the books aren’t suitable for this particular audience. A prime example is “R in a Nutshell”.

I ended up recommending five books to the first year introductory R class.

Recommended Books

  • A first course in statistical programming with R (Braun & Murdoch)
    • Pros: I quite like this book (hence the reason I put it on my list). It has a nice collection of exercises, it “looks nice” and doesn’t assume knowledge of programming. It also doesn’t assume (or try to teach) any statistics.
    • Cons: When describing for loops and functions the examples aren’t very statistical. For example, it uses Fibonacci sequences in the while loop section and the sieve of Eratosthenes for if statements.
  • An introduction to R (Venables & Smith)
    • Pros: Simple, short and to the point. Free copies available. Money from the book goes to the R project.
    • Cons: More an R reference guide than a textbook.
  • A Beginner's Guide to R by Zuur.
    • Pros: Assumes no prior knowledge. Proceeds through concepts slowly and carefully.
    • Cons: Proceeds through concepts very slowly and carefully.
  • R in a Nutshell by Adler.
    • I completely agree with a recent review by Robin Wilson: “Very comprehensive and very useful, but not good for a beginner. Great book though – definitely has a place on my bookshelf.”
    • Pros: An excellent reference.
    • Cons: Only suitable for students with a previous computer background.
  • Introduction to Scientific Programming and Simulation Using R by Jones, Maillardet and Robinson.
    • Pros: A nice book that teaches R programming. Similar to the Braun & Murdoch book.
    • Cons: A bit pricey in comparison to the other books

Books not being recommended

These books were mentioned in the comments of the previous post.

  • The Basics of S-PLUS by Krause & Olson.
    • Most students struggle with R. Introducing a similar, but slightly different language is too sadistic.
  • Software for Data Analysis: Programming with R by Chambers.
    • Assumes some previous statistical knowledge.
  • Bayesian Computation with R by Albert.
    • Not suitable for first year students who haven’t taken any previous statistics courses.
  • R Graphics by Paul Murrell
    • I know graphics are important, but a whole book for an undergraduate student might be too much. I did toy with the idea of recommending this book, but I thought that five recommendations were more than sufficient.
  • ggplot2 by Hadley Wickham.
    • Great book, but our students don’t encounter ggplot2 in their undergraduate course.

Online Resources

  • Introduction to Probability and Statistics by Kerns
    • Suitable for a combined R and statistics course. But I don’t really do much stats in this module.
  • The R Programming wikibook (a work in progress).
    • Will give the students this link.
  • Biological Data Analysis Using R by Rodney J. Dyer. Available under the CC license.
    • Nice resource. Possibly a little big for this course (I know that this is very picky, but I had to draw the line somewhere). Will probably use it for future courses.
  • Hadley Wickham’s devtools wiki (a work in progress).
    • Assumes a good working knowledge of R
  • The R Inferno by Patrick Burns
    • Good book, but too advanced for students who have never programmed before.
  • Introduction to S programming
    • It’s in French – this may or may not be a good thing depending on your point of view ;)

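Vim expert? Take a look at vimgolf.com

vimgolf is a website for competing online at Vim skills. To see how fiendish the challenges are, take this one. The original text is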

this line doesn’t have indentation
this line is indented with two spaces
this one has four
this other one has two
this one is indented with two spaces
this line has a four spaces indentation
this line needs six spaces
this line needs six spaces too
this line is back to four spaces
this line is finally indented with two spaces
this final line is not indented

The task is to indent each line by the number of spaces it names, that is, to format it like this:

this line doesn’t have indentation
  this line is indented with two spaces
    this one has four
  this other one has two
  this one is indented with two spaces
    this line has a four spaces indentation
      this line needs six spaces
      this line needs six spaces too
    this line is back to four spaces
  this line is finally indented with two spaces
this final line is not indented
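
To make the expected transformation concrete (this is not a Vim solution and has nothing to do with the keystroke count), here is a small Python sketch of the rule. The `WIDTHS` table and the `reindent` name are made up for illustration, and it relies on the sample text only ever naming "two", "four", or "six".

```python
# Number words that appear in the sample text, mapped to indentation widths.
WIDTHS = {"two": 2, "four": 4, "six": 6}


def reindent(line: str) -> str:
    """Indent a line by the number of spaces its own text asks for."""
    text = line.strip()
    # Use the first number word found; lines that name none get no indentation.
    width = next((WIDTHS[w] for w in text.split() if w in WIDTHS), 0)
    return " " * width + text
```

Applying `reindent` to each line of the original text gives the formatted version shown above.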

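So far three people have submitted solutions, and the best one got it done in only 25 keystrokes.
If you consider yourself a Vim expert, go take on the challenge.

Note: to submit a result, it seems you first have to log in with Twitter to get a numeric ID and then install a program with Ruby. I never managed to get it installed, and I don't know why.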

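Batch editing in VIM (repost)

Original post: Batch editing in VIM

In VIM, batch editing is done by opening several files at once and operating on the argument list (args).

From the command line,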

 gvim a.txt b.txt c.txt

opens several files. Inside GVIM,

 :args *.txt

does the same.

Afterwards, :ls or :buffers lists the current buffers, and :args shows the argument list.

To process every file in the argument list at once, use argdo.

 :argdo %s/teh/the/ge | update

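argdo simply takes a command after it and applies that command to every file in the arg list.

The e flag suppresses the error when the pattern is not found; you would not want one unexpected failure to halt the whole run.

The update command saves the file if it has changed, much like write. Without it, you would have to save each of these files by hand afterwards.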
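
For comparison, the same batch replacement can be sketched outside Vim. The Python below is only an illustration, not part of the original article: `replace_in_files` is a made-up name, it does a plain substring replacement rather than a Vim regex, and it assumes the files are UTF-8 text.

```python
from pathlib import Path


def replace_in_files(paths, old: str, new: str) -> list:
    """Replace old with new in each file; rewrite only files that changed."""
    changed = []
    for path in map(Path, paths):
        text = path.read_text(encoding="utf-8")
        updated = text.replace(old, new)
        if updated != text:  # like :update, leave unchanged files alone
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

The counterpart of the argdo line above would be `replace_in_files(Path(".").glob("*.txt"), "teh", "the")`. A file without a match is simply skipped, which mirrors what the e flag achieves in Vim.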

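Fcitx 4.0 Zhengma code table

Download: Fcitx 4.0 Zhengma code table

In the zm.mb on the official site, the word order is rather messy and many words have wrong codes; for example, the code for “浏览” should be vskm, not vskd. The word-composition rules are wrong too.

My revision is based on winzm, with the phrases from the wbx table bundled with fcitx 4.0 added.

btw: around 2005 I wrote mb2scim in Pascal to convert win code tables to scim code tables, and it took quite a lot of time. This time I used Python, which was easy and pleasant. The earlier program ran to a little over 400 lines of code; this one is under 60, although the two do not do exactly the same thing.
Also, if 3.6 is working fine for you, don't bother switching; 4.0 does not seem particularly stable yet.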

Extracting PDF pages with pdftk

pdftk input.pdf cat 15-28 output outfile.pdf
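
If the page range changes often, the command can be wrapped in a few lines of Python. This is just a convenience sketch: the function names are made up, and it assumes the pdftk binary is installed and on PATH.

```python
import shutil
import subprocess


def pdftk_cat_command(src: str, first: int, last: int, dst: str) -> list:
    """Build the pdftk argument list that extracts pages first..last."""
    if not 1 <= first <= last:
        raise ValueError("page range must satisfy 1 <= first <= last")
    return ["pdftk", src, "cat", "%d-%d" % (first, last), "output", dst]


def extract_pages(src: str, first: int, last: int, dst: str) -> None:
    """Run pdftk with the command built above."""
    if shutil.which("pdftk") is None:
        raise RuntimeError("pdftk not found on PATH")
    subprocess.run(pdftk_cat_command(src, first, last, dst), check=True)
```

`extract_pages("input.pdf", 15, 28, "outfile.pdf")` runs the same command as the line above.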

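Editing a PDF file's document information with pdftk

Ctrl+D in Acrobat Reader or Alt+Enter in Evince shows a PDF file's document information, including the author, keywords, and the software used to create the PDF. To delete or modify this information, use the following pdftk command: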

pdftk in.pdf update_info in.info output out.pdf

Here in.info holds the document information you want. The file follows a specific syntax, the same one used by report.txt as produced by the command below.

pdftk in.pdf dump_data output report.txt

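In practice, first use the second command to extract the document information of in.pdf into report.txt. Then change the relevant values in report.txt to what you need, and use the first command to write them back.

Note that in the information produced by dump_data, Chinese characters are turned into a Unicode-encoded form, but the info file in.info used for import may contain Chinese characters directly.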
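
The "change the value in report.txt" step can be scripted as well. The Python sketch below assumes the report stores each field as an `InfoKey:` line followed by an `InfoValue:` line, which is how dump_data output looks as far as I know; check it against your own report.txt. The name `set_info_value` is made up for illustration.

```python
def set_info_value(dump_text: str, key: str, value: str) -> str:
    """Return dump_text with the InfoValue after 'InfoKey: key' replaced."""
    lines = dump_text.splitlines()
    pending = False  # True right after the wanted InfoKey line was seen
    for i, line in enumerate(lines):
        if line.startswith("InfoKey:"):
            pending = line.split(":", 1)[1].strip() == key
        elif pending and line.startswith("InfoValue:"):
            lines[i] = "InfoValue: " + value
            pending = False
    return "\n".join(lines) + "\n"
```

Reading report.txt, passing its text through `set_info_value(text, "Author", "...")`, and writing the result to in.info gives a file that can be fed to update_info. A key that does not occur in the report is left alone; nothing is added.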
