I was recently given an interesting task at work. I was asked to export all blog posts for a given author from a Drupal site into a Microsoft Word document. At first, I wasn't sure how I was going to accomplish this, so I turned to Google and found a few PHP classes that purported to do exactly what I needed. However, a few false starts later, I was unable to get any of them to work.
That's when I came across LiveDocX.
LiveDocX is a template-based SaaS solution that lets developers create word processing documents by combining user-defined Microsoft Word templates with data from disparate sources, such as XML files and databases. It is typically used to create professional, print-ready documents in DOCX, DOC, RTF and PDF.
LiveDocx is a Web Service that can be easily integrated into any web application without installing or configuring any software on your server. Currently, the following programming languages are supported:
* ASP.NET
* PHP
As LiveDocx is strictly based on open standards, it is simple to add support for more programming languages. As long as SOAP (Simple Object Access Protocol) is available on the client-side system, LiveDocx runs on all operating systems and in all programming languages.
This looked to be the best solution for what I was attempting to do, and best of all, it was free. All I had to do was sign up for an account, and then I was free to begin coding my solution.
I knew that I wanted the solution to be dynamic; I didn't want to hard-code the author into my code. Instead, I wanted to be able to export any author's blog posts. So, step 1 was to create a form that would allow site administrators to find the author they are looking for. The form consists of a textfield with autocomplete functionality, and a select box with export options. For the purpose of this example, the only option is to export to MS Word (doc). However, LiveDocX also supports docx, rtf, and pdf.
<?php
function MYMODULE_blog_export_form() {
$form = array();
$form['export'] = array(
'#type' => 'fieldset',
'#title' => t('Blog Export Options'),
'#collapsed' => false,
'#collapsible' => false,
);
$form['export']['method'] = array(
'#type' => 'select',
'#title' => t('Export output type'),
'#options' => array(
'msword' => t('Microsoft Word'),
),
);
$form['export']['author'] = array(
'#type' => 'textfield',
'#title' => t('Author name'),
'#autocomplete_path' => 'admin/autocomplete/bloggers',
'#description' => t('Enter the name of the user.'),
);
$form['submit'] = array(
'#type' => 'submit',
'#value' => t('Submit'),
);
return $form;
}
?>
You'll notice that the textfield has an #autocomplete_path property, which is the path Drupal requests to fetch autocomplete suggestions. The callback behind that path is defined like this:
<?php
/**
* Menu callback function to provide autocomplete functionality for
* searching for users by username
*
* @param string $search_string
*/
function MYMODULE_autocomplete_bloggers($search_string) {
$matches = array();
// Find users in the 'blogger' role whose name starts with the search string.
// Joining on the role name in a single query avoids building an IN () list,
// which would be invalid SQL if the role did not exist.
$result = db_query_range("SELECT u.uid, u.name FROM {users} u
INNER JOIN {users_roles} ur ON ur.uid = u.uid
INNER JOIN {role} r ON r.rid = ur.rid
WHERE r.name = '%s' AND LOWER(u.name) LIKE LOWER('%s%%')", 'blogger', $search_string, 0, 50);
while ($row = db_fetch_object($result)) {
$matches[$row->name] = $row->name;
}
print drupal_to_js($matches);
exit();
}
?>
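Both the export form and the autocomplete callback need menu entries before Drupal will route requests to them. The post doesn't show that part, so here is a minimal sketch assuming Drupal 6's hook_menu(). The two paths come from the code in this post, while the titles and the 'administer nodes' permission are assumptions you would adapt to your site.
<?php
/**
 * Implementation of hook_menu() (Drupal 6 style).
 *
 * Sketch only: the title and the permission string are assumptions.
 */
function MYMODULE_menu() {
  $items = array();
  // Page that displays the export form.
  $items['admin/content/blogs/export'] = array(
    'title' => 'Blog export',
    'page callback' => 'drupal_get_form',
    'page arguments' => array('MYMODULE_blog_export_form'),
    'access arguments' => array('administer nodes'),
  );
  // Autocomplete callback. Drupal appends the typed text to this path
  // and passes it to the callback as its first argument.
  $items['admin/autocomplete/bloggers'] = array(
    'page callback' => 'MYMODULE_autocomplete_bloggers',
    'access arguments' => array('administer nodes'),
    'type' => MENU_CALLBACK,
  );
  return $items;
}
?>
Remember to clear the menu cache after adding or changing menu items, otherwise Drupal 6 will not pick up the new paths.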
Now that the autocomplete functionality was hooked up, it was time to define the form's submit handler. The submit handler makes use of Drupal's Batch API to define a batch process and execute it.
<?php
function MYMODULE_blog_export_form_submit($form, $form_state) {
$uid = db_result(db_query("SELECT uid FROM {users} WHERE name = '%s'", $form_state['values']['author']));
if($uid) {
$start_func = 'MYMODULE_blog_export_'.$form_state['values']['method'];
$finished_func = 'MYMODULE_blog_export_'.$form_state['values']['method'].'_batch_process_finished';
// Add a batch set with simple operations taking an argument.
$batch = array(
'title' => t('Blog Export'), // Not displayed.
'operations' => array(
array($start_func, array($uid)),
),
'finished' => $finished_func,
);
batch_set($batch);
batch_process('admin/content/blogs/export');
}
else {
drupal_set_message(t('An error occurred while trying to process this action.'), 'error');
}
}
?>
The above code defines a batch process with a start function, and an end function. For clarity, the start function is responsible for finding all of the nodes for the specified author. The end function is responsible for sending those results to LiveDocX.
<?php
function MYMODULE_blog_export_msword($uid, &$context) {
$limit = 5;
$context['finished'] = 0;
if (!isset($context['sandbox']['progress'])) {
// No ORDER BY here: it is meaningless on a COUNT and rejected by some databases.
$max_nodes = db_result(db_query("SELECT COUNT(n.nid) FROM {node} n WHERE n.uid = %d AND n.type = 'blog'", $uid));
$context['sandbox']['progress'] = 0;
$context['sandbox']['current_node'] = 0;
$context['sandbox']['max'] = $max_nodes;
$context['sandbox']['results']['author'] = user_load(array('uid' => $uid));
$context['sandbox']['results']['block_values'] = array();
}
$nodes = array();
$result = db_query_range("SELECT n.nid, n.type FROM {node} n WHERE n.nid > %d AND n.uid = %d AND n.type = 'blog' ORDER BY nid ASC", $context['sandbox']['current_node'], $uid, 0, $limit);
while($row = db_fetch_object($result)) {
$nodes[$row->nid] = $row;
}
if(count($nodes) == 0) {
// Drupal 6 argument order is cache_set($cid, $data, $table).
cache_set('famed:blog_export_results', serialize($context['sandbox']['results']), 'cache');
$context['finished'] = 1;
}
$context['message'] = t('Processing nodes authored by user %uid', array('%uid' => $uid));
foreach ($nodes as $node) {
// Process the node
$node = node_load($node->nid);
if($node) {
$content = node_view($node, false, true, false);
if($content) {
$context['sandbox']['results']['block_values'][] = array (
'post_title' => $node->title,
'content' => strip_tags($node->body),
// 'H:i:s' is 24-hour time with minutes; 'm' would print the month.
'created' => date('Y-m-d H:i:s', $node->created),
'updated' => date('Y-m-d H:i:s', $node->changed),
'pub_status' => ($node->status == 1) ? 'Published' : 'Unpublished',
// The nodewords data may be missing on some posts.
'tags' => isset($node->nodewords['keywords']) ? $node->nodewords['keywords'] : '',
);
}
}
// Update our progress information.
$context['message'] = t('Processing blog posts authored by user %uid', array('%uid' => $uid));
$context['results'][] = t('Processed node %node', array('%node' => $node->nid));
$context['sandbox']['progress']++;
$context['sandbox']['current_node'] = $node->nid;
}
// Inform the batch engine that we are not finished,
// and provide an estimation of the completion level we reached.
if ($context['sandbox']['progress'] != $context['sandbox']['max']) {
$context['finished'] = $context['sandbox']['progress'] / $context['sandbox']['max'];
}
}
?>
This code fetches all of the nodes from the database, and stores information about each node in the $context['sandbox']['results']['block_values'] array. When there are no more nodes to process, this array gets serialized and stored in Drupal's cache, so that the data can be used in the batch finished function.
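To see why the operation reports a fraction in $context['finished'], it can help to simulate the loop outside Drupal. The sketch below is not Drupal's batch engine: example_run_operation() is a hypothetical stand-in that keeps calling an operation with the same $context until it reports completion, and example_export_chunk() mirrors the sandbox pattern above with a plain array in place of the node table.
<?php
/**
 * Hypothetical operation: processes up to $limit items per call and
 * tracks its position in $context['sandbox'], like the code above.
 */
function example_export_chunk($items, $limit, &$context) {
  if (!isset($context['sandbox']['progress'])) {
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['max'] = count($items);
  }
  $chunk = array_slice($items, $context['sandbox']['progress'], $limit);
  foreach ($chunk as $item) {
    $context['results'][] = $item;
    $context['sandbox']['progress']++;
  }
  // Report 1 when done (or when there was nothing to do), else a fraction.
  if ($context['sandbox']['max'] == 0 || $context['sandbox']['progress'] >= $context['sandbox']['max']) {
    $context['finished'] = 1;
  }
  else {
    $context['finished'] = $context['sandbox']['progress'] / $context['sandbox']['max'];
  }
}

/**
 * Hypothetical stand-in for the batch engine: repeats the operation
 * until it reports completion and counts how many calls that took.
 */
function example_run_operation($items, $limit) {
  $context = array('sandbox' => array(), 'results' => array(), 'finished' => 0);
  $calls = 0;
  while ($context['finished'] < 1) {
    example_export_chunk($items, $limit, $context);
    $calls++;
  }
  return array('calls' => $calls, 'results' => $context['results']);
}

$run = example_run_operation(range(1, 12), 5);
// 12 items at 5 per call take 3 calls: 5, 5, then 2.
print $run['calls'] . ' calls, ' . count($run['results']) . " items\n";
?>
In Drupal the repeated calls are spread across several page requests, which is why the position has to live in the sandbox rather than in a local variable.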
The finished function is responsible for sending all of the data to the LiveDocX web service. It's important that you create your document template before trying to send data to the web service, as LiveDocX works like a mail merge. Your template will consist of named MailMerge fields, and merge blocks for repeating data.
<?php
/**
* Batch finished handler.
*/
function MYMODULE_blog_export_msword_batch_process_finished($success, $results, $operations) {
// Load the data from Drupal's cache
$cache = cache_get('famed:blog_export_results', 'cache');
// Unserialize the cache data
$blog_data = unserialize($cache->data);
cache_clear_all('famed:blog_export_results', 'cache');
if ($blog_data) {
// Turn up error reporting
error_reporting (E_ALL|E_STRICT);
// Turn off WSDL caching
ini_set ('soap.wsdl_cache_enabled', 0);
// Define credentials for LD
$credentials = array(
'username' => 'my_user_name',
'password' => 'my_password',
);
// SOAP WSDL endpoint
$endpoint = 'https://api.livedocx.com/1.2/mailmerge.asmx?WSDL';
// Define timezone
date_default_timezone_set('Europe/Berlin');
// Create a new instance of the SoapClient object
$soap = new SoapClient($endpoint);
$soap->LogIn(
array(
'username' => $credentials['username'],
'password' => $credentials['password']
)
);
// Upload template
$path_to_template = './'.drupal_get_path('module', 'MYMODULE').'/template.doc';
$data = file_get_contents($path_to_template);
if(empty($data)) {
drupal_set_message('Failed to read the template', 'error');
watchdog('famed', 'Failed to read the template', WATCHDOG_ERROR);
return;
}
$soap->SetLocalTemplate(array(
'template' => base64_encode($data),
'format' => 'doc'
));
$fieldValues = array (
'author' => $blog_data['author']->name,
'email' => $blog_data['author']->mail,
'title' => 'Blog Posts by '.$blog_data['author']->name,
);
/**
* In the template, these field values are used on the title page of the document,
* and in the header/footer of the document.
*/
$soap->SetFieldValues(array (
'fieldValues' => assocArrayToArrayOfArrayOfString($fieldValues)
));
// Block values is the repeating data, in this case, the contents of each blog post
$soap->SetBlockFieldValues(array(
'blockName' => 'blogpost',
'blockFieldValues' => multiAssocArrayToArrayOfArrayOfString($blog_data['block_values'])
));
// Build the document
$soap->CreateDocument();
// Get document as DOC
$result = $soap->RetrieveDocument(array(
'format' => 'doc'
));
// Fetch the document
$data = $result->RetrieveDocumentResult;
$filename = './sites/default/files/blog.doc';
if(file_exists($filename)) {
unlink($filename);
}
// Write the document to the filesystem
file_put_contents($filename, base64_decode($data));
// Force the browser to download the document
if(file_exists($filename)) {
header('Content-Type: application/msword');
header('Content-Disposition: attachment; filename="blog.doc"');
header('Content-Length: ' . filesize($filename));
readfile($filename);
exit;
}
else {
drupal_set_message('Failed to download the file', 'error');
}
}
else {
// An error occurred.
// $operations contains the operations that remained unprocessed.
$error_operation = reset($operations);
$message = t('An error occurred while processing %error_operation with arguments: @arguments', array('%error_operation' => $error_operation[0], '@arguments' => print_r($error_operation[1], TRUE)));
// Report the message here, the only branch in which $message is defined.
drupal_set_message($message, 'error');
}
}
?>
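The finished handler above assumes that every SOAP call succeeds. PHP's SoapClient throws a SoapFault when a call fails (exceptions are enabled by default), so one option is to wrap the connection and login in a try/catch. This is a sketch rather than part of the original module: example_livedocx_connect() is a hypothetical helper, and it only uses the LogIn call already shown above.
<?php
/**
 * Hypothetical helper: connect to LiveDocX and log in.
 *
 * @param string $endpoint
 *   The WSDL URL.
 * @param string $username
 * @param string $password
 * @return SoapClient|false
 *   The logged-in client, or FALSE if the connection or login failed.
 */
function example_livedocx_connect($endpoint, $username, $password) {
  try {
    // 'exceptions' => TRUE is the default; it is stated here for clarity.
    $soap = new SoapClient($endpoint, array('exceptions' => TRUE));
    $soap->LogIn(array(
      'username' => $username,
      'password' => $password,
    ));
    return $soap;
  }
  catch (SoapFault $fault) {
    // Record the fault. Inside Drupal this could be a watchdog() call.
    error_log('LiveDocX request failed: ' . $fault->getMessage());
    return FALSE;
  }
}
?>
In the finished handler, the result could be checked before uploading the template, showing a message with drupal_set_message() and returning early when the helper gives back FALSE. The same try/catch pattern would apply to the later calls such as CreateDocument() and RetrieveDocument().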
The data structures sent to LiveDocX can be tricky to get right in PHP, so some additional functions are needed to massage the data passed to the SetFieldValues() and SetBlockFieldValues() methods:
<?php
/**
* Convert a PHP assoc array to a SOAP array of array of string
*
* @param array $assoc
* @return array
*/
function assocArrayToArrayOfArrayOfString ($assoc) {
$arrayKeys = array_keys($assoc);
$arrayValues = array_values($assoc);
return array ($arrayKeys, $arrayValues);
}
/**
* Convert a PHP multi-depth assoc array to a SOAP array of array of array of string
*
* @param array $multi
* @return array
*/
function multiAssocArrayToArrayOfArrayOfString ($multi){
// Nothing to convert if the author has no posts.
if (empty($multi)) {
return array();
}
$arrayKeys = array_keys($multi[0]);
$arrayValues = array();
foreach ($multi as $v) {
$arrayValues[] = array_values($v);
}
$_arrayKeys = array();
$_arrayKeys[0] = $arrayKeys;
return array_merge($_arrayKeys, $arrayValues);
}
?>
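To make the shapes concrete, the snippet below builds the same structures inline for some sample data, using the same array_keys() and array_values() logic as the two helpers. The sample names and values are made up for illustration.
<?php
// Sample field values (made up).
$fieldValues = array(
  'author' => 'jdoe',
  'email' => 'jdoe@example.com',
);
// Same result as assocArrayToArrayOfArrayOfString(): a row of field
// names, then a row of values.
$fields = array(array_keys($fieldValues), array_values($fieldValues));

// Sample block values: one associative array per blog post (made up).
$posts = array(
  array('post_title' => 'First post', 'pub_status' => 'Published'),
  array('post_title' => 'Second post', 'pub_status' => 'Unpublished'),
);
// Same result as multiAssocArrayToArrayOfArrayOfString(): a header row
// of field names, followed by one row of values per post.
$blocks = array(array_keys($posts[0]));
foreach ($posts as $post) {
  $blocks[] = array_values($post);
}

print_r($fields);
print_r($blocks);
?>
Here $fields holds two rows (names, then values), while $blocks holds three: the header row followed by one row for each of the two posts. Every post must list its keys in the same order, since only the first post's keys are used for the header row.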
The trickiest part for me was getting the template correct in order for the LiveDocX service to work properly. I didn't know how to create merge blocks; as it turns out, it's as simple as inserting bookmarks into your template that follow a specific naming convention:
blockstart_
blockend_
It's also important to know that LiveDocX currently requires merge blocks to be defined in table cells. Future enhancements of the service will support merge blocks defined anywhere in the document. I am excited for this to happen, as it will make the service a lot more flexible.
The full API for LiveDocX can be found here.